Overview

Dataset statistics

Number of variables: 5
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.2 KiB
Average record size in memory: 43.3 B

Variable types

Categorical: 1
Text: 2
Numeric: 2

Alerts

강수량(mm) is highly overall correlated with 시도명 [High correlation]
시도명 is highly overall correlated with 강수량(mm) [High correlation]

Reproduction

Analysis started: 2023-12-10 10:53:45.994318
Analysis finished: 2023-12-10 10:53:47.892393
Duration: 1.9 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
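
The reported duration can be checked against the two timestamps above. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block of the report.
started = datetime.fromisoformat("2023-12-10 10:53:45.994318")
finished = datetime.fromisoformat("2023-12-10 10:53:47.892393")

# Elapsed wall-clock time in seconds; the report rounds this to one decimal.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.1f} seconds")
```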

Variables

시도명
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value       Count
경기도      31
경상북도    23
강원도      18
경상남도    18
전라남도    10

Length

Max length: 4
Median length: 4
Mean length: 3.51
Min length: 3
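
The length statistics follow directly from the value counts listed for 시도명. The sketch below rebuilds the 100 string lengths from those counts and recomputes them; it assumes each Hangul syllable is stored as one precomposed code point, so `len()` counts syllables.

```python
from statistics import mean, median

# Value counts for 시도명 as listed in the report.
counts = {"경기도": 31, "경상북도": 23, "강원도": 18, "경상남도": 18, "전라남도": 10}

# One length per observation: len() counts code points, i.e. Hangul syllables here.
lengths = [len(name) for name, n in counts.items() for _ in range(n)]

print("Max length:", max(lengths))
print("Median length:", median(lengths))
print("Mean length:", round(mean(lengths), 2))
print("Min length:", min(lengths))
```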

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강원도
2nd row: 강원도
3rd row: 강원도
4th row: 강원도
5th row: 강원도

Common Values

Value       Count   Frequency (%)
경기도      31      31.0%
경상북도    23      23.0%
강원도      18      18.0%
경상남도    18      18.0%
전라남도    10      10.0%

Length

Histogram of lengths of the category

Common Values (Plot)

시군명
Text

Distinct: 99
Distinct (%): 99.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 3
Mean length: 3.03
Min length: 3

Characters and Unicode

Total characters: 303
Distinct characters: 85
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 98
Unique (%): 98.0%
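
The gap between 99 distinct and 98 unique values comes from a single name, 고성군, that occurs twice: "distinct" counts different values, while "unique" counts values that occur exactly once. The sketch below reproduces both figures on a stand-in list; the `placeholder_…` names are hypothetical fillers, not values from the dataset.

```python
from collections import Counter

# 고성군 appears twice in the report; the other 98 names are hypothetical stand-ins.
values = ["고성군", "고성군"] + [f"placeholder_{i}" for i in range(98)]

counts = Counter(values)
distinct = len(counts)                                # number of different values
unique = sum(1 for c in counts.values() if c == 1)    # values occurring exactly once

print(f"Distinct: {distinct} ({distinct / len(values):.1%})")
print(f"Unique: {unique} ({unique / len(values):.1%})")
```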

Sample

1st row: 정선군
2nd row: 평창군
3rd row: 영월군
4th row: 횡성군
5th row: 홍천군

Value               Count   Frequency (%)
고성군              2       2.0%
경산시              1       1.0%
김해시              1       1.0%
영천시              1       1.0%
영주시              1       1.0%
구미시              1       1.0%
안동시              1       1.0%
경주시              1       1.0%
포항시              1       1.0%
김천시              1       1.0%
Other values (89)   89      89.0%

Most occurring characters

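Value               Count   Frequency (%)
[missing]           54      17.8%
[missing]           49      16.2%
[missing]           14      4.6%
[missing]           12      4.0%
[missing]           11      3.6%
[missing]           9       3.0%
[missing]           8       2.6%
[missing]           6       2.0%
[missing]           5       1.7%
[missing]           5       1.7%
Other values (75)   130     42.9%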

Most occurring categories

Value          Count   Frequency (%)
Other Letter   303     100.0%

Most frequent character per category

Other Letter
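Value               Count   Frequency (%)
[missing]           54      17.8%
[missing]           49      16.2%
[missing]           14      4.6%
[missing]           12      4.0%
[missing]           11      3.6%
[missing]           9       3.0%
[missing]           8       2.6%
[missing]           6       2.0%
[missing]           5       1.7%
[missing]           5       1.7%
Other values (75)   130     42.9%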

Most occurring scripts

Value    Count   Frequency (%)
Hangul   303     100.0%

Most frequent character per script

Hangul
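Value               Count   Frequency (%)
[missing]           54      17.8%
[missing]           49      16.2%
[missing]           14      4.6%
[missing]           12      4.0%
[missing]           11      3.6%
[missing]           9       3.0%
[missing]           8       2.6%
[missing]           6       2.0%
[missing]           5       1.7%
[missing]           5       1.7%
Other values (75)   130     42.9%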

Most occurring blocks

Value    Count   Frequency (%)
Hangul   303     100.0%

Most frequent character per block

Hangul
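Value               Count   Frequency (%)
[missing]           54      17.8%
[missing]           49      16.2%
[missing]           14      4.6%
[missing]           12      4.0%
[missing]           11      3.6%
[missing]           9       3.0%
[missing]           8       2.6%
[missing]           6       2.0%
[missing]           5       1.7%
[missing]           5       1.7%
Other values (75)   130     42.9%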

강수량(mm)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 76
Distinct (%): 76.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.627
Minimum: 3
Maximum: 78.1
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 3
5-th percentile: 5.88
Q1: 8.875
Median: 12.9
Q3: 18.05
95-th percentile: 35.44
Maximum: 78.1
Range: 75.1
Interquartile range (IQR): 9.175
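
Range and IQR are derived from the other quantile figures: range is maximum minus minimum, and IQR is Q3 minus Q1. A small sketch that recomputes both from the values listed above (variable names are illustrative):

```python
# Quantile figures for 강수량(mm) copied from the report.
minimum, q1, q3, maximum = 3.0, 8.875, 18.05, 78.1

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle 50% of observations

# Rounded to three decimals to hide floating-point noise.
print("Range:", round(value_range, 3))
print("Interquartile range (IQR):", round(iqr, 3))
```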

Descriptive statistics

Standard deviation: 10.496828
Coefficient of variation (CV): 0.67171103
Kurtosis: 11.974139
Mean: 15.627
Median Absolute Deviation (MAD): 4.55
Skewness: 2.7104885
Sum: 1562.7
Variance: 110.18341
Monotonicity: Not monotonic
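
Several of these figures are tied together by simple identities: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. The sketch below checks those identities against the reported values. Small differences in the last digits are expected because the report prints rounded numbers.

```python
# Figures for 강수량(mm) copied from the report.
n_observations = 100
total = 1562.7
std_dev = 10.496828

mean_value = total / n_observations   # reported as 15.627
cv = std_dev / mean_value             # reported as 0.67171103
variance = std_dev ** 2               # reported as 110.18341

print(f"Mean: {mean_value:.3f}")
print(f"CV: {cv:.6f}")
print(f"Variance: {variance:.4f}")
```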
Histogram with fixed size bins (bins=50)

Value               Count   Frequency (%)
9.1                 4       4.0%
16.3                3       3.0%
15.5                3       3.0%
17.8                3       3.0%
11.1                3       3.0%
6.8                 2       2.0%
14.7                2       2.0%
21.3                2       2.0%
16.7                2       2.0%
11.7                2       2.0%
Other values (66)   74      74.0%

Minimum 10 values

Value   Count   Frequency (%)
3.0     1       1.0%
4.9     1       1.0%
5.0     1       1.0%
5.1     1       1.0%
5.5     1       1.0%
5.9     1       1.0%
6.4     2       2.0%
6.5     1       1.0%
6.6     1       1.0%
6.8     2       2.0%

Maximum 10 values

Value   Count   Frequency (%)
78.1    1       1.0%
39.7    1       1.0%
37.2    1       1.0%
37.0    1       1.0%
36.2    1       1.0%
35.4    1       1.0%
35.2    1       1.0%
34.5    1       1.0%
33.0    1       1.0%
32.2    1       1.0%

평년(mm)
Real number (ℝ)

Distinct: 81
Distinct (%): 81.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.641
Minimum: 16.3
Maximum: 114.2
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 16.3
5-th percentile: 18.48
Q1: 20.775
Median: 23.75
Q3: 29.725
95-th percentile: 40.595
Maximum: 114.2
Range: 97.9
Interquartile range (IQR): 8.95

Descriptive statistics

Standard deviation: 11.045216
Coefficient of variation (CV): 0.41459463
Kurtosis: 39.743861
Mean: 26.641
Median Absolute Deviation (MAD): 3.85
Skewness: 5.292122
Sum: 2664.1
Variance: 121.99679
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count   Frequency (%)
30.9                3       3.0%
19.4                3       3.0%
21.2                3       3.0%
20.9                2       2.0%
24.5                2       2.0%
23.5                2       2.0%
22.6                2       2.0%
19.9                2       2.0%
24.2                2       2.0%
22.0                2       2.0%
Other values (71)   77      77.0%

Minimum 10 values

Value   Count   Frequency (%)
16.3    1       1.0%
17.0    1       1.0%
17.2    1       1.0%
17.8    1       1.0%
18.1    1       1.0%
18.5    1       1.0%
18.9    1       1.0%
19.0    1       1.0%
19.3    1       1.0%
19.4    3       3.0%

Maximum 10 values

Value   Count   Frequency (%)
114.2   1       1.0%
45.6    1       1.0%
45.0    1       1.0%
44.2    1       1.0%
42.4    1       1.0%
40.5    1       1.0%
39.6    1       1.0%
39.3    1       1.0%
38.1    1       1.0%
35.4    1       1.0%

부족량
Text

Distinct: 69
Distinct (%): 69.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 5
Median length: 5
Mean length: 4.28
Min length: 1

Characters and Unicode

Total characters: 428
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
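
As an illustration of those character properties, Python's standard `unicodedata` module returns the two-letter general-category code for any character. The sketch below maps the codes to the long names that appear in this report; the `CATEGORY_NAMES` table and the `describe` helper are illustrative and are not how the profiling tool itself is implemented.

```python
import unicodedata

# Long names, as printed in the report, for the general-category codes
# that unicodedata.category() returns.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
    "Lo": "Other Letter",
}

def describe(ch: str) -> str:
    """Return the long general-category name for a single character."""
    code = unicodedata.category(ch)
    return CATEGORY_NAMES.get(code, code)

# Characters of the kinds found in 부족량 (digits, ".", space, "-") and in 시군명 (Hangul).
for ch in ["1", ".", " ", "-", "정"]:
    print(repr(ch), describe(ch))
```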

Unique

Unique: 49
Unique (%): 49.0%

Sample

1st row: 15.0
2nd row: 16.8
3rd row: 7.9
4th row: 8.2
5th row: 6.9

Value               Count   Frequency (%)
-                   9       9.0%
16.8                3       3.0%
12.8                3       3.0%
13.0                3       3.0%
8.2                 3       3.0%
14.2                2       2.0%
12.6                2       2.0%
11.1                2       2.0%
5.9                 2       2.0%
3.4                 2       2.0%
Other values (59)   69      69.0%

Most occurring characters

Value              Count   Frequency (%)
.                  91      21.3%
(space)            91      21.3%
1                  62      14.5%
2                  30      7.0%
3                  25      5.8%
8                  23      5.4%
6                  21      4.9%
4                  19      4.4%
9                  19      4.4%
0                  16      3.7%
Other values (3)   31      7.2%

Most occurring categories

Value               Count   Frequency (%)
Decimal Number      237     55.4%
Other Punctuation   91      21.3%
Space Separator     91      21.3%
Dash Punctuation    9       2.1%

Most frequent character per category

Decimal Number

Value   Count   Frequency (%)
1       62      26.2%
2       30      12.7%
3       25      10.5%
8       23      9.7%
6       21      8.9%
4       19      8.0%
9       19      8.0%
0       16      6.8%
7       12      5.1%
5       10      4.2%

Other Punctuation

Value   Count   Frequency (%)
.       91      100.0%

Space Separator

Value     Count   Frequency (%)
(space)   91      100.0%

Dash Punctuation

Value   Count   Frequency (%)
-       9       100.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   428     100.0%

Most frequent character per script

Common
Value              Count   Frequency (%)
.                  91      21.3%
(space)            91      21.3%
1                  62      14.5%
2                  30      7.0%
3                  25      5.8%
8                  23      5.4%
6                  21      4.9%
4                  19      4.4%
9                  19      4.4%
0                  16      3.7%
Other values (3)   31      7.2%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   428     100.0%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
.                  91      21.3%
(space)            91      21.3%
1                  62      14.5%
2                  30      7.0%
3                  25      5.8%
8                  23      5.4%
6                  21      4.9%
4                  19      4.4%
9                  19      4.4%
0                  16      3.7%
Other values (3)   31      7.2%

Interactions


Correlations

             시도명   시군명   강수량(mm)   평년(mm)   부족량
시도명       1.000    0.837    0.705        0.432      0.771
시군명       0.837    1.000    0.981        1.000      0.998
강수량(mm)   0.705    0.981    1.000        0.780      0.000
평년(mm)     0.432    1.000    0.780        1.000      0.500
부족량       0.771    0.998    0.000        0.500      1.000

             강수량(mm)   평년(mm)   시도명
강수량(mm)   1.000        0.342      0.567
평년(mm)     0.342        1.000      0.363
시도명       0.567        0.363      1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     시도명     시군명   강수량(mm)   평년(mm)   부족량
0    강원도     정선군   5.9          20.9       15.0
1    강원도     평창군   11.7         28.5       16.8
2    강원도     영월군   9.1          17.0       7.9
3    강원도     횡성군   12.4         20.6       8.2
4    강원도     홍천군   13.6         20.5       6.9
5    강원도     삼척시   5.1          34.5       29.4
6    강원도     양양군   9.1          42.4       33.3
7    강원도     고성군   9.9          26.7       16.8
8    강원도     인제군   7.2          19.4       12.2
9    강원도     양구군   4.9          17.2       12.3

Last rows

     시도명     시군명   강수량(mm)   평년(mm)   부족량
90   전라남도   화순군   31.5         26.1       -
91   전라남도   장흥군   35.2         29.1       -
92   전라남도   강진군   37.0         29.7       -
93   전라남도   해남군   35.4         32.2       -
94   전라남도   영암군   34.5         32.2       -
95   전라남도   무안군   33.0         35.4       2.4
96   전라남도   함평군   37.2         38.1       0.9
97   전라남도   영광군   36.2         39.6       3.4
98   전라남도   장성군   32.2         39.3       7.1
99   전라남도   완도군   39.7         33.5       -
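
In the sample rows, 부족량 (shortage) appears to equal 평년(mm) (normal-year value) minus 강수량(mm) (observed precipitation) whenever that difference is positive, and "-" otherwise. This is an inference from the 20 rows shown, not something the report states. The sketch below encodes that assumed rule; the `shortage` helper is hypothetical, and how the source data treats a difference of exactly zero is unknown.

```python
def shortage(precipitation_mm: float, normal_mm: float) -> str:
    """Assumed rule: normal minus observed, or "-" when there is no shortfall."""
    diff = round(normal_mm - precipitation_mm, 1)
    # Treating a zero difference as "-" is a guess; the sample has no such row.
    return f"{diff:.1f}" if diff > 0 else "-"

# (시군명, 강수량(mm), 평년(mm)) taken from sample rows 0, 95 and 90.
rows = [("정선군", 5.9, 20.9), ("무안군", 33.0, 35.4), ("화순군", 31.5, 26.1)]

for name, precipitation, normal in rows:
    print(name, shortage(precipitation, normal))
```

Because the column mixes numeric strings with "-", it is consistent with the report classifying 부족량 as Text rather than Numeric.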