Overview

Dataset statistics

Number of variables: 4
Number of observations: 115
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.9 KiB
Average record size in memory: 35.1 B

Variable types

Categorical: 1
Numeric: 2
Text: 1

Dataset

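Description: Groundwater usage by basin across the country. Fields provided: 대권역 (major basin), 중권역코드 (mid-level basin code), 중권역 (mid-level basin), and usage (tons/year).
Author: 한국수자원공사 (Korea Water Resources Corporation)
URL: https://www.data.go.kr/data/15054534/fileData.do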

Alerts

중권역코드 is highly overall correlated with 대권역 (High correlation)
대권역 is highly overall correlated with 중권역코드 (High correlation)
중권역코드 has unique values (Unique)
중권역 has unique values (Unique)
이용량(세제곱미터퍼_년) has unique values (Unique)

Reproduction

Analysis started: 2024-05-04 07:38:06.211707
Analysis finished: 2024-05-04 07:38:08.844921
Duration: 2.63 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
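
A report of this kind can be regenerated with a few lines of Python. The following is a minimal sketch under stated assumptions, not the script that produced this report: `build_profile` is a hypothetical helper, and the CSV file name, the `cp949` encoding, and the report title are all assumptions.

```python
def build_profile(csv_path, output_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so the function can be defined without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the downloaded file is CP949-encoded; pass encoding="utf-8" if not.
    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Groundwater usage by basin")
    report.to_file(output_path)
    return report

# Example call with a hypothetical file name:
# build_profile("groundwater_usage.csv")
```

The timestamps and memory figures of a regenerated report will differ from the ones listed here.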

Variables

대권역
Categorical

HIGH CORRELATION 

Distinct: 21
Distinct (%): 18.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Value                Count
한강                 22
낙동강               22
금강                 14
섬진강               9
영산강               8
Other values (16)    40

Length

Max length: 5
Median length: 4
Mean length: 3.1130435
Min length: 2

Unique

Unique: 5
Unique (%): 4.3%

Sample

1st row: 한강
2nd row: 한강
3rd row: 한강
4th row: 한강
5th row: 한강

Common Values

Value                Count    Frequency (%)
한강                 22       19.1%
낙동강               22       19.1%
금강                 14       12.2%
섬진강               9        7.8%
영산강               8        7.0%
섬진강남해           6        5.2%
제주도               4        3.5%
낙동강남해           4        3.5%
금강서해             3        2.6%
낙동강동해           3        2.6%
Other values (11)    20       17.4%
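
The Frequency (%) column is each count divided by the 115 observations. As a quick check, the sketch below recomputes the shares from the counts listed above; the dictionary is typed in by hand from this table and is not read from the source file.

```python
# Counts copied from the Common Values table for 대권역.
counts = {
    "한강": 22,
    "낙동강": 22,
    "금강": 14,
    "섬진강": 9,
    "영산강": 8,
    "섬진강남해": 6,
    "제주도": 4,
    "낙동강남해": 4,
    "금강서해": 3,
    "낙동강동해": 3,
    "Other values (11)": 20,
}

# The counts should add up to the number of observations in the dataset.
total = sum(counts.values())

# Share of each value as a percentage, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, share in shares.items():
    print(f"{name}: {share}%")
```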

Length

Histogram of lengths of the category

Value                Count    Frequency (%)
한강                 22       19.1%
낙동강               22       19.1%
금강                 14       12.2%
섬진강               9        7.8%
영산강               8        7.0%
섬진강남해           6        5.2%
제주도               4        3.5%
낙동강남해           4        3.5%
만경동진             3        2.6%
영산강서해           3        2.6%
Other values (11)    20       17.4%

중권역코드
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 115
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2790.4
Minimum: 1001
Maximum: 6004
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 1001
5-th percentile: 1006.7
Q1: 2001.5
Median: 2501
Q3: 4004.5
95-th percentile: 5301.3
Maximum: 6004
Range: 5003
Interquartile range (IQR): 2003

Descriptive statistics

Standard deviation: 1455.7369
Coefficient of variation (CV): 0.52169471
Kurtosis: -0.742838
Mean: 2790.4
Median Absolute Deviation (MAD): 1299
Skewness: 0.52073891
Sum: 320896
Variance: 2119170
Monotonicity: Strictly increasing
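
Several of these figures follow from the others: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the range and IQR are differences of quantiles. The sketch below recomputes them from the values reported for 중권역코드; the inputs are copied by hand from this report, so small rounding differences are expected.

```python
# Values copied from the 중권역코드 statistics in this report.
count = 115
total = 320896
std = 1455.7369
minimum, maximum = 1001, 6004
q1, q3 = 2001.5, 4004.5

mean = total / count            # arithmetic mean
cv = std / mean                 # coefficient of variation
variance = std ** 2             # variance is the squared standard deviation
value_range = maximum - minimum
iqr = q3 - q1                   # interquartile range

print(f"mean={mean:.1f} cv={cv:.4f} variance={variance:.0f}")
print(f"range={value_range} iqr={iqr}")
```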
Histogram with fixed size bins (bins=50)

Common values

Value                 Count    Frequency (%)
1001                  1        0.9%
3013                  1        0.9%
4004                  1        0.9%
4003                  1        0.9%
4002                  1        0.9%
4001                  1        0.9%
3303                  1        0.9%
3302                  1        0.9%
3301                  1        0.9%
3203                  1        0.9%
Other values (105)    105      91.3%

Minimum 10 values

Value    Count    Frequency (%)
1001     1        0.9%
1002     1        0.9%
1003     1        0.9%
1004     1        0.9%
1005     1        0.9%
1006     1        0.9%
1007     1        0.9%
1009     1        0.9%
1010     1        0.9%
1011     1        0.9%

Maximum 10 values

Value    Count    Frequency (%)
6004     1        0.9%
6003     1        0.9%
6002     1        0.9%
6001     1        0.9%
5303     1        0.9%
5302     1        0.9%
5301     1        0.9%
5202     1        0.9%
5201     1        0.9%
5101     1        0.9%

중권역
Text

UNIQUE 

Distinct: 115
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Length

Max length: 6
Median length: 5
Mean length: 3.5478261
Min length: 2

Characters and Unicode

Total characters: 408
Distinct characters: 109
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 115
Unique (%): 100.0%

Sample

1st row: 남한강상류
2nd row: 평창강
3rd row: 충주댐
4th row: 달천
5th row: 충주댐하류

Value                 Count    Frequency (%)
남한강상류            1        0.9%
남해도                1        0.9%
오수천                1        0.9%
섬진강댐하류          1        0.9%
섬진강댐              1        0.9%
새만금                1        0.9%
동진강                1        0.9%
만경강                1        0.9%
금강서해              1        0.9%
부남방조제            1        0.9%
Other values (105)    105      91.3%

Most occurring characters

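Value                Count    Frequency (%)
(not preserved)      43       10.5%
(not preserved)      36       8.8%
(not preserved)      20       4.9%
(not preserved)      14       3.4%
(not preserved)      14       3.4%
(not preserved)      12       2.9%
(not preserved)      11       2.7%
(not preserved)      11       2.7%
(not preserved)      10       2.5%
(not preserved)      9        2.2%
Other values (99)    228      55.9%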

Most occurring categories

ValueCountFrequency (%)
Other Letter 408
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 408
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 408
100.0%


이용량(세제곱미터퍼_년)
Real number (ℝ)

UNIQUE 

Distinct: 115
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26427422
Minimum: 1561919.2
Maximum: 1.2445637 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 1561919.2
5-th percentile: 2706795.7
Q1: 10834065
Median: 19093176
Q3: 33624486
95-th percentile: 69723331
Maximum: 1.2445637 × 10^8
Range: 1.2289445 × 10^8
Interquartile range (IQR): 22790422

Descriptive statistics

Standard deviation: 24109786
Coefficient of variation (CV): 0.91230183
Kurtosis: 4.5094133
Mean: 26427422
Median Absolute Deviation (MAD): 9838225.5
Skewness: 1.9634088
Sum: 3.0391535 × 10^9
Variance: 5.8128176 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                 Count    Frequency (%)
17473069.3            1        0.9%
33003428.16           1        0.9%
17349927.83           1        0.9%
9524012.427           1        0.9%
6210377.33            1        0.9%
17881867.24           1        0.9%
6539128.477           1        0.9%
47615822.75           1        0.9%
61211809.64           1        0.9%
43815316.01           1        0.9%
Other values (105)    105      91.3%

Minimum 10 values

Value          Count    Frequency (%)
1561919.216    1        0.9%
1641946.793    1        0.9%
1735487.129    1        0.9%
2126451.322    1        0.9%
2590978.161    1        0.9%
2697735.314    1        0.9%
2710678.728    1        0.9%
3298755.712    1        0.9%
3427606.784    1        0.9%
3875574.832    1        0.9%

Maximum 10 values

Value          Count    Frequency (%)
124456371.6    1        0.9%
121716493.2    1        0.9%
106803303.3    1        0.9%
90649538.19    1        0.9%
88342152.45    1        0.9%
85299249.02    1        0.9%
63047938.1     1        0.9%
61211809.64    1        0.9%
58656364.62    1        0.9%
58419334.09    1        0.9%

Interactions

[Figures: interaction plots for the numeric variables 중권역코드 and 이용량(세제곱미터퍼_년); images not preserved]

Correlations

                          대권역    중권역코드    이용량(세제곱미터퍼_년)
대권역                    1.000     0.947         0.722
중권역코드                0.947     1.000         0.148
이용량(세제곱미터퍼_년)   0.722     0.148         1.000

                          중권역코드    이용량(세제곱미터퍼_년)    대권역
중권역코드                1.000         -0.036                     0.706
이용량(세제곱미터퍼_년)   -0.036        1.000                      0.343
대권역                    0.706         0.343                      1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Row    대권역        중권역코드    중권역        이용량(세제곱미터퍼_년)
0      한강          1001          남한강상류    17473069.3
1      한강          1002          평창강        22092515.63
2      한강          1003          충주댐        50091967.13
3      한강          1004          달천          50212013.03
4      한강          1005          충주댐하류    20359021.95
5      한강          1006          섬강          38803134.19
6      한강          1007          남한강하류    124456371.6
7      한강          1009          평화의댐      3875574.832
8      한강          1010          춘천댐        18430311.76
9      한강          1011          인북천        2126451.322

Last rows

Row    대권역        중권역코드    중권역        이용량(세제곱미터퍼_년)
105    탐진강        5101          탐진강        13620653.88
106    영산강남해    5201          진도          6274500.6
107    영산강남해    5202          영암방조제    31668175.76
108    영산강서해    5301          주진천        28476598.04
109    영산강서해    5302          와탄천        43692315.63
110    영산강서해    5303          신안군        1735487.129
111    제주도        6001          제주서해      54854843.22
112    제주도        6002          제주북해      53427500.47
113    제주도        6003          제주남해      45693363.96
114    제주도        6004          제주동해      106803303.3
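
In the rows shown above, the leading two digits of 중권역코드 line up with 대권역 (10 with 한강, 60 with 제주도), which would explain the high-correlation alert between the two columns. The sketch below checks this on the sample rows only. Whether it holds for all 115 observations is an assumption that needs the full file, and `basins_by_prefix` is a hypothetical helper.

```python
from collections import defaultdict

# (대권역, 중권역코드) pairs copied from the first and last rows of the sample.
sample_rows = [
    ("한강", 1001), ("한강", 1002), ("한강", 1003), ("한강", 1004),
    ("한강", 1005), ("한강", 1006), ("한강", 1007), ("한강", 1009),
    ("한강", 1010), ("한강", 1011),
    ("탐진강", 5101),
    ("영산강남해", 5201), ("영산강남해", 5202),
    ("영산강서해", 5301), ("영산강서해", 5302), ("영산강서해", 5303),
    ("제주도", 6001), ("제주도", 6002), ("제주도", 6003), ("제주도", 6004),
]


def basins_by_prefix(rows):
    """Group major-basin names by the leading two digits of the code."""
    groups = defaultdict(set)
    for basin, code in rows:
        groups[code // 100].add(basin)
    return dict(groups)


prefix_map = basins_by_prefix(sample_rows)

# True when every two-digit prefix maps to exactly one major basin.
consistent = all(len(names) == 1 for names in prefix_map.values())

print(prefix_map)
print(consistent)
```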