Overview

Dataset statistics

Number of variables: 5
Number of observations: 49
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.1 KiB
Average record size in memory: 44.7 B

Variable types

Numeric: 2
Categorical: 1
Text: 2

Dataset

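Description: Current status of water purification facilities in Gyeongsangnam-do, providing the city/county (si/gun) name, the facility name, and the plant capacity (㎥/day) for each facility.
Author: Gyeongsangnam-do (경상남도)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=3034244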

Alerts

연번 is highly overall correlated with 시군명 (High correlation)
시군명 is highly overall correlated with 연번 (High correlation)
연번 has unique values (Unique)
취수장명 has unique values (Unique)
취수장 시설용량(㎥/일) has 1 (2.0%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-11 00:18:40.652803
Analysis finished: 2023-12-11 00:18:41.402446
Duration: 0.75 seconds
Software version: ydata-profiling v4.5.1

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 49
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25
Minimum: 1
Maximum: 49
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 573.0 B

Quantile statistics

Minimum: 1
5-th percentile: 3.4
Q1: 13
Median: 25
Q3: 37
95-th percentile: 46.6
Maximum: 49
Range: 48
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 14.28869
Coefficient of variation (CV): 0.57154761
Kurtosis: -1.2
Mean: 25
Median Absolute Deviation (MAD): 12
Skewness: 0
Sum: 1225
Variance: 204.16667
Monotonicity: Strictly increasing
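
The figures above can be cross-checked by hand. The sketch below assumes that 연번 is simply the running index 1 to 49 (which the minimum, maximum, distinct count and monotonicity all suggest), that the variance is the sample variance (n - 1 denominator), and that percentiles use linear interpolation. These are assumptions about how the report was computed, not statements taken from it.

```python
import statistics

# Assumption: 연번 is the running index 1..49.
values = list(range(1, 50))

mean = statistics.mean(values)
variance = statistics.variance(values)      # sample variance (n - 1)
std = statistics.stdev(values)              # sample standard deviation
cv = std / mean                             # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median of |x - median(x)|.
mad = statistics.median(abs(v - median) for v in values)

# Linear-interpolation quantiles over the closed range of the data.
quartiles = statistics.quantiles(values, n=4, method="inclusive")
ventiles = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = ventiles[0], ventiles[-1]

strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(f"sum={sum(values)} mean={mean} variance={variance:.5f} std={std:.5f}")
print(f"cv={cv:.8f} mad={mad} quartiles={quartiles} p5={p5} p95={p95}")
print(f"strictly increasing: {strictly_increasing}")
```

Under these assumptions the script yields the same sum, variance, standard deviation, quartiles and MAD as listed above.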
Histogram with fixed size bins (bins=49)
Value | Count | Frequency (%)
1 | 1 | 2.0%
38 | 1 | 2.0%
28 | 1 | 2.0%
29 | 1 | 2.0%
30 | 1 | 2.0%
31 | 1 | 2.0%
32 | 1 | 2.0%
33 | 1 | 2.0%
34 | 1 | 2.0%
35 | 1 | 2.0%
Other values (39) | 39 | 79.6%

Value | Count | Frequency (%)
1 | 1 | 2.0%
2 | 1 | 2.0%
3 | 1 | 2.0%
4 | 1 | 2.0%
5 | 1 | 2.0%
6 | 1 | 2.0%
7 | 1 | 2.0%
8 | 1 | 2.0%
9 | 1 | 2.0%
10 | 1 | 2.0%

Value | Count | Frequency (%)
49 | 1 | 2.0%
48 | 1 | 2.0%
47 | 1 | 2.0%
46 | 1 | 2.0%
45 | 1 | 2.0%
44 | 1 | 2.0%
43 | 1 | 2.0%
42 | 1 | 2.0%
41 | 1 | 2.0%
40 | 1 | 2.0%

시군명
Categorical

HIGH CORRELATION 

Distinct: 17
Distinct (%): 34.7%
Missing: 0
Missing (%): 0.0%
Memory size: 524.0 B
Most frequent categories: 남해군, 창원시, 합천군, 산청군, 거창군
Other values (12): 21

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 6
Unique (%): 12.2%

Sample

1st row: 창원시
2nd row: 창원시
3rd row: 창원시
4th row: 창원시
5th row: 창원시

Common Values

Value | Count | Frequency (%)
남해군 | 9 | 18.4%
창원시 | 6 | 12.2%
합천군 | 5 | 10.2%
산청군 | 4 | 8.2%
거창군 | 4 | 8.2%
의령군 | 4 | 8.2%
함안군 | 3 | 6.1%
하동군 | 2 | 4.1%
함양군 | 2 | 4.1%
창녕군 | 2 | 4.1%
Other values (7) | 8 | 16.3%
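
Each percentage in the table is the category count divided by the 49 observations. A minimal sketch of that arithmetic follows; the `counts` dictionary is transcribed from the table above, and rounding to one decimal place is an assumption about how the report formats its output.

```python
# Counts transcribed from the Common Values table for 시군명.
counts = {
    "남해군": 9,
    "창원시": 6,
    "합천군": 5,
    "산청군": 4,
    "거창군": 4,
    "의령군": 4,
    "함안군": 3,
    "하동군": 2,
    "함양군": 2,
    "창녕군": 2,
    "Other values (7)": 8,
}

total = sum(counts.values())  # should equal the number of observations

# Frequency (%) = 100 * count / total, rounded to one decimal (assumed).
frequencies = {name: round(100 * c / total, 1) for name, c in counts.items()}

for name, pct in frequencies.items():
    print(f"{name}: {counts[name]} ({pct}%)")
```

The counts add up to the 49 observations, and the ten named categories plus the seven grouped under "Other values" account for the 17 distinct values reported.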

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
남해군 | 9 | 18.4%
창원시 | 6 | 12.2%
합천군 | 5 | 10.2%
산청군 | 4 | 8.2%
거창군 | 4 | 8.2%
의령군 | 4 | 8.2%
함안군 | 3 | 6.1%
창녕군 | 2 | 4.1%
양산시 | 2 | 4.1%
함양군 | 2 | 4.1%
Other values (7) | 8 | 16.3%

취수장명
Text

UNIQUE 

Distinct: 49
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 524.0 B

Length

Max length: 10
Median length: 2
Mean length: 2.755102
Min length: 2

Characters and Unicode

Total characters: 135
Distinct characters: 77
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
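
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It uses only the five sample values of 취수장명 shown in this report, so its totals differ from the full-column figures. Checking for a character name that starts with "HANGUL" is a rough stand-in for a script lookup, which the standard library does not provide; it is not necessarily how the report derives scripts.

```python
import unicodedata
from collections import Counter

# The five sample values of 취수장명 shown in this report.
samples = ["대산(1만)", "대산(12만)", "북면", "창원칠서", "성주"]

# General category per character: Lo = Other Letter, Nd = Decimal Number,
# Ps = Open Punctuation, Pe = Close Punctuation.
category_counts = Counter(
    unicodedata.category(ch) for name in samples for ch in name
)

# Rough script proxy (assumption): Hangul syllables have names that
# start with "HANGUL".
hangul_chars = sum(
    1
    for name in samples
    for ch in name
    if unicodedata.name(ch, "").startswith("HANGUL")
)

total_chars = sum(len(name) for name in samples)
print(dict(category_counts), hangul_chars, total_chars)
```

On these samples, every Hangul syllable falls in the Other Letter (Lo) category, which is why that category dominates the category table for this column.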

Unique

Unique: 49
Unique (%): 100.0%

Sample

1st row: 대산(1만)
2nd row: 대산(12만)
3rd row: 북면
4th row: 창원칠서
5th row: 성주
Value | Count | Frequency (%)
창원칠서 | 2 | 4.0%
대산(1만 | 1 | 2.0%
시천 | 1 | 2.0%
지족 | 1 | 2.0%
남면 | 1 | 2.0%
대곡 | 1 | 2.0%
창선 | 1 | 2.0%
항도 | 1 | 2.0%
하동 | 1 | 2.0%
청룡 | 1 | 2.0%
Other values (39) | 39 | 78.0%

Most occurring characters

ValueCountFrequency (%)
5
 
3.7%
( 5
 
3.7%
) 5
 
3.7%
5
 
3.7%
5
 
3.7%
4
 
3.0%
3
 
2.2%
3
 
2.2%
3
 
2.2%
3
 
2.2%
Other values (67) 94
69.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 120 | 88.9%
Open Punctuation | 5 | 3.7%
Close Punctuation | 5 | 3.7%
Decimal Number | 4 | 3.0%
Space Separator | 1 | 0.7%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5
 
4.2%
5
 
4.2%
5
 
4.2%
4
 
3.3%
3
 
2.5%
3
 
2.5%
3
 
2.5%
3
 
2.5%
3
 
2.5%
3
 
2.5%
Other values (62) 83
69.2%
Decimal Number
ValueCountFrequency (%)
2 2
50.0%
1 2
50.0%
Open Punctuation
ValueCountFrequency (%)
( 5
100.0%
Close Punctuation
ValueCountFrequency (%)
) 5
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 120 | 88.9%
Common | 15 | 11.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5
 
4.2%
5
 
4.2%
5
 
4.2%
4
 
3.3%
3
 
2.5%
3
 
2.5%
3
 
2.5%
3
 
2.5%
3
 
2.5%
3
 
2.5%
Other values (62) 83
69.2%
Common
ValueCountFrequency (%)
( 5
33.3%
) 5
33.3%
2 2
 
13.3%
1 2
 
13.3%
1
 
6.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 120 | 88.9%
ASCII | 15 | 11.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
5
 
4.2%
5
 
4.2%
5
 
4.2%
4
 
3.3%
3
 
2.5%
3
 
2.5%
3
 
2.5%
3
 
2.5%
3
 
2.5%
3
 
2.5%
Other values (62) 83
69.2%
ASCII
ValueCountFrequency (%)
( 5
33.3%
) 5
33.3%
2 2
 
13.3%
1 2
 
13.3%
1
 
6.7%

취수장 시설용량(㎥/일)
Real number (ℝ)

ZEROS 

Distinct: 30
Distinct (%): 61.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 43032.857
Minimum: 0
Maximum: 500000
Zeros: 1
Zeros (%): 2.0%
Negative: 0
Negative (%): 0.0%
Memory size: 573.0 B

Quantile statistics

Minimum: 0
5-th percentile: 740
Q1: 1000
Median: 2600
Q3: 11000
95-th percentile: 279000
Maximum: 500000
Range: 500000
Interquartile range (IQR): 10000

Descriptive statistics

Standard deviation: 109840.46
Coefficient of variation (CV): 2.5524789
Kurtosis: 9.0726084
Mean: 43032.857
Median Absolute Deviation (MAD): 1800
Skewness: 3.0721865
Sum: 2108610
Variance: 1.2064927 × 10^10
Monotonicity: Not monotonic
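
Several of these statistics are derived from one another, so they can be checked for internal consistency. The sketch below uses only the rounded figures printed in this report (not the underlying data), so small rounding differences are expected. The variable names are illustrative.

```python
import math

# Figures as printed in this report for 취수장 시설용량(㎥/일).
n = 49                      # number of observations
total = 2108610             # Sum
std = 109840.46             # Standard deviation (rounded)
q1, q3 = 1000, 11000        # Q1 and Q3
minimum, maximum = 0, 500000
zeros = 1

mean = total / n            # Mean = Sum / n
cv = std / mean             # Coefficient of variation = std / mean
variance = std ** 2         # Variance = std squared
iqr = q3 - q1               # Interquartile range
value_range = maximum - minimum
zeros_pct = round(100 * zeros / n, 1)

print(f"mean={mean:.3f} cv={cv:.7f} variance={variance:.7e}")
print(f"iqr={iqr} range={value_range} zeros={zeros_pct}%")
print("variance matches report:",
      math.isclose(variance, 1.2064927e10, rel_tol=1e-6))
```

A coefficient of variation above 2 and a mean far above the median (43032.857 versus 2600) both reflect the strong right skew: a handful of very large facilities dominate the total capacity.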
Histogram with fixed size bins (bins=30)
Value | Count | Frequency (%)
1000 | 5 | 10.2%
2000 | 5 | 10.2%
800 | 5 | 10.2%
11000 | 2 | 4.1%
4000 | 2 | 4.1%
900 | 2 | 4.1%
5000 | 2 | 4.1%
700 | 2 | 4.1%
3000 | 2 | 4.1%
2200 | 2 | 4.1%
Other values (20) | 20 | 40.8%

Value | Count | Frequency (%)
0 | 1 | 2.0%
700 | 2 | 4.1%
800 | 5 | 10.2%
900 | 2 | 4.1%
1000 | 5 | 10.2%
1500 | 1 | 2.0%
2000 | 5 | 10.2%
2200 | 2 | 4.1%
2500 | 1 | 2.0%
2600 | 1 | 2.0%

Value | Count | Frequency (%)
500000 | 1 | 2.0%
440000 | 1 | 2.0%
285000 | 1 | 2.0%
270000 | 1 | 2.0%
220000 | 1 | 2.0%
126000 | 1 | 2.0%
74200 | 1 | 2.0%
23810 | 1 | 2.0%
22000 | 1 | 2.0%
20000 | 1 | 2.0%

취수원
Text

Distinct: 40
Distinct (%): 81.6%
Missing: 0
Missing (%): 0.0%
Memory size: 524.0 B

Length

Max length: 15
Median length: 14
Mean length: 6.877551
Min length: 1

Characters and Unicode

Total characters: 337
Distinct characters: 67
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 35
Unique (%): 71.4%

Sample

1st row: 낙동강(강변여과수)
2nd row: 낙동강(강변여과수)
3rd row: 낙동강(강변여과수)
4th row: 낙동강
5th row: 성주수원지(호소수)
Value | Count | Frequency (%)
낙동강(강변여과수 | 4 | 7.8%
낙동강 | 3 | 5.9%
남강 | 3 | 5.9%
가야천 | 2 | 3.9%
황강 | 2 | 3.9%
낙동강(배분량 | 2 | 3.9%
항도저수지(호소수 | 1 | 2.0%
섬진강 | 1 | 2.0%
청룡저수지(호소수 | 1 | 2.0%
지족저수지(호소수 | 1 | 2.0%
Other values (31) | 31 | 60.8%

Most occurring characters

ValueCountFrequency (%)
46
13.6%
) 27
 
8.0%
( 27
 
8.0%
24
 
7.1%
24
 
7.1%
23
 
6.8%
21
 
6.2%
17
 
5.0%
14
 
4.2%
11
 
3.3%
Other values (57) 103
30.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 269 | 79.8%
Close Punctuation | 27 | 8.0%
Open Punctuation | 27 | 8.0%
Decimal Number | 10 | 3.0%
Space Separator | 2 | 0.6%
Other Punctuation | 1 | 0.3%
Dash Punctuation | 1 | 0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
46
17.1%
24
 
8.9%
24
 
8.9%
23
 
8.6%
21
 
7.8%
17
 
6.3%
14
 
5.2%
11
 
4.1%
9
 
3.3%
4
 
1.5%
Other values (48) 76
28.3%
Decimal Number
ValueCountFrequency (%)
0 6
60.0%
5 2
 
20.0%
2 1
 
10.0%
6 1
 
10.0%
Close Punctuation
ValueCountFrequency (%)
) 27
100.0%
Open Punctuation
ValueCountFrequency (%)
( 27
100.0%
Space Separator
ValueCountFrequency (%)
2
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 269 | 79.8%
Common | 68 | 20.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
46
17.1%
24
 
8.9%
24
 
8.9%
23
 
8.6%
21
 
7.8%
17
 
6.3%
14
 
5.2%
11
 
4.1%
9
 
3.3%
4
 
1.5%
Other values (48) 76
28.3%
Common
ValueCountFrequency (%)
) 27
39.7%
( 27
39.7%
0 6
 
8.8%
5 2
 
2.9%
2
 
2.9%
, 1
 
1.5%
2 1
 
1.5%
6 1
 
1.5%
- 1
 
1.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 269 | 79.8%
ASCII | 68 | 20.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
46
17.1%
24
 
8.9%
24
 
8.9%
23
 
8.6%
21
 
7.8%
17
 
6.3%
14
 
5.2%
11
 
4.1%
9
 
3.3%
4
 
1.5%
Other values (48) 76
28.3%
ASCII
ValueCountFrequency (%)
) 27
39.7%
( 27
39.7%
0 6
 
8.8%
5 2
 
2.9%
2
 
2.9%
, 1
 
1.5%
2 1
 
1.5%
6 1
 
1.5%
- 1
 
1.5%

Interactions


Correlations

 | 연번 | 시군명 | 취수장명 | 취수장 시설용량(㎥/일) | 취수원
연번 | 1.000 | 0.954 | 1.000 | 0.388 | 0.979
시군명 | 0.954 | 1.000 | 1.000 | 0.763 | 0.972
취수장명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
취수장 시설용량(㎥/일) | 0.388 | 0.763 | 1.000 | 1.000 | 0.000
취수원 | 0.979 | 0.972 | 1.000 | 0.000 | 1.000
 | 연번 | 취수장 시설용량(㎥/일) | 시군명
연번 | 1.000 | -0.484 | 0.705
취수장 시설용량(㎥/일) | -0.484 | 1.000 | 0.410
시군명 | 0.705 | 0.410 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

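index | 연번 | 시군명 | 취수장명 | 취수장 시설용량(㎥/일) | 취수원
0 | 1 | 창원시 | 대산(1만) | 11000 | 낙동강(강변여과수)
1 | 2 | 창원시 | 대산(12만) | 126000 | 낙동강(강변여과수)
2 | 3 | 창원시 | 북면 | 11000 | 낙동강(강변여과수)
3 | 4 | 창원시 | 창원칠서 | 440000 | 낙동강
4 | 5 | 창원시 | 성주 | 23810 | 성주수원지(호소수)
5 | 6 | 창원시 | 본포(수자원공사) | 285000 | 낙동강(배분량 62,000)
6 | 7 | 진주시 | 진양호 | 220000 | 남강댐(호소수)
7 | 8 | 통영시 | 욕지 | 800 | 욕지수원지(호소수)
8 | 9 | 사천시 | 곤명 | 2200 | 덕천강
9 | 10 | 김해시 | 창암 | 270000 | 낙동강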
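
index | 연번 | 시군명 | 취수장명 | 취수장 시설용량(㎥/일) | 취수원
39 | 40 | 함양군 | 안의 | 2000 | 지우천
40 | 41 | 거창군 | 위천 | 800 | 상천저수지(호소수)
41 | 42 | 거창군 | 거창 | 20000 | 황강천
42 | 43 | 거창군 | 가조 | 3300 | 가천천
43 | 44 | 거창군 | 웅양 | 800 | 계수천
44 | 45 | 합천군 | 합천 | 10000 | 황강
45 | 46 | 합천군 | 적중 | 2000 | 황강
46 | 47 | 합천군 | 해인사 | 700 | 가야천
47 | 48 | 합천군 | 삼가 | 1000 | 양천강
48 | 49 | 합천군 | 가야 | 2200 | 가야천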