Overview

Dataset statistics

Number of variables: 4
Number of observations: 51
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.8 KiB
Average record size in memory: 36.6 B

Variable types

Numeric: 2
Categorical: 1
Text: 1

Dataset

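Description: Status of water purification facilities in Gyeongsangnam-do (경상남도). The dataset provides the city/county name (시군명), the purification plant name (정수장명), and the plant's facility capacity (정수장 시설용량, ㎥/day).
Author: 경상남도 (Gyeongsangnam-do)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=3034244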

Alerts

연번 is highly overall correlated with 시군명 (High correlation)
시군명 is highly overall correlated with 연번 (High correlation)
연번 has unique values (Unique)
정수장명 has unique values (Unique)

Reproduction

Analysis started: 2023-12-11 00:18:45.224508
Analysis finished: 2023-12-11 00:18:45.916889
Duration: 0.69 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
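
The reported duration follows directly from the two timestamps above. A minimal sketch using only the Python standard library (the variable names are illustrative, not part of the report):

```python
from datetime import datetime

# Timestamps copied from the Reproduction block of this report.
started = datetime.fromisoformat("2023-12-11 00:18:45.224508")
finished = datetime.fromisoformat("2023-12-11 00:18:45.916889")

# Subtracting two datetimes yields a timedelta; convert it to seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")  # 0.69 seconds
```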

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 51
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26
Minimum: 1
Maximum: 51
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 591.0 B

Quantile statistics

Minimum: 1
5-th percentile: 3.5
Q1: 13.5
median: 26
Q3: 38.5
95-th percentile: 48.5
Maximum: 51
Range: 50
Interquartile range (IQR): 25
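
These quantiles can be reproduced by hand. The sketch below assumes 연번 is simply the integers 1 through 51 (consistent with the reported minimum, maximum, distinct count, and sum of this variable) and uses linear interpolation between data points, which is what `method="inclusive"` does in the standard library. Whether the profiling tool uses exactly this interpolation is an assumption; the results agree with the values listed above.

```python
import statistics

# Assumption: 연번 is the running index 1..51 with no gaps.
values = list(range(1, 52))

# "inclusive" interpolates linearly between the min and max of the data.
quartiles = statistics.quantiles(values, n=4, method="inclusive")
ventiles = statistics.quantiles(values, n=20, method="inclusive")

q1, median, q3 = quartiles
p5, p95 = ventiles[0], ventiles[-1]   # 5th and 95th percentiles
iqr = q3 - q1
value_range = max(values) - min(values)

print(p5, q1, median, q3, p95, value_range, iqr)
```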

Descriptive statistics

Standard deviation: 14.866069
Coefficient of variation (CV): 0.57177187
Kurtosis: -1.2
Mean: 26
Median Absolute Deviation (MAD): 13
Skewness: 0
Sum: 1326
Variance: 221
Monotonicity: Strictly increasing
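
As a cross-check, the descriptive statistics above can be recomputed with the standard library, again assuming 연번 is the integers 1 through 51. The sketch uses the sample variance (n - 1 denominator), which matches the reported variance of 221; kurtosis and skewness are left out because the standard library has no direct function for them.

```python
import statistics

# Assumption: 연번 is the running index 1..51 with no gaps.
values = list(range(1, 52))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, n - 1 denominator
std_dev = statistics.stdev(values)       # square root of the sample variance
cv = std_dev / mean                      # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median of the distances from the median.
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, variance, round(std_dev, 6), round(cv, 8), mad)
```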
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
1                  1      2.0%
2                  1      2.0%
29                 1      2.0%
30                 1      2.0%
31                 1      2.0%
32                 1      2.0%
33                 1      2.0%
34                 1      2.0%
35                 1      2.0%
36                 1      2.0%
Other values (41)  41     80.4%

Value  Count  Frequency (%)
1      1      2.0%
2      1      2.0%
3      1      2.0%
4      1      2.0%
5      1      2.0%
6      1      2.0%
7      1      2.0%
8      1      2.0%
9      1      2.0%
10     1      2.0%

Value  Count  Frequency (%)
51     1      2.0%
50     1      2.0%
49     1      2.0%
48     1      2.0%
47     1      2.0%
46     1      2.0%
45     1      2.0%
44     1      2.0%
43     1      2.0%
42     1      2.0%

시군명
Categorical

HIGH CORRELATION 

Distinct: 17
Distinct (%): 33.3%
Missing: 0
Missing (%): 0.0%
Memory size: 540.0 B
[Bar chart of category frequencies: 남해군 (11), 의령군, 창원시, 거창군, 합천군, Other values (12): 22]

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 4
Unique (%): 7.8%

Sample

1st row: 창원시
2nd row: 창원시
3rd row: 창원시
4th row: 창원시
5th row: 창원시

Common Values

Value             Count  Frequency (%)
남해군            11     21.6%
의령군            5      9.8%
창원시            5      9.8%
거창군            4      7.8%
합천군            4      7.8%
양산시            3      5.9%
산청군            3      5.9%
창녕군            2      3.9%
함양군            2      3.9%
하동군            2      3.9%
Other values (7)  10     19.6%
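
The Frequency (%) column is each count divided by the number of observations. A minimal sketch that recomputes it from the counts in the Common Values table above (the `rows` and `percentages` names are illustrative):

```python
# (label, count) pairs copied from the Common Values table for 시군명.
rows = [
    ("남해군", 11), ("의령군", 5), ("창원시", 5), ("거창군", 4),
    ("합천군", 4), ("양산시", 3), ("산청군", 3), ("창녕군", 2),
    ("함양군", 2), ("하동군", 2), ("Other values (7)", 10),
]

# The counts should add up to the number of observations in the dataset.
total = sum(count for _, count in rows)

# Share of each label, formatted to one decimal place like the report.
percentages = {label: f"{count / total:.1%}" for label, count in rows}

for label, count in rows:
    print(label, count, percentages[label])
```

The counts sum to 51, which agrees with the number of observations in the Overview.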

Length

Histogram of lengths of the category
Value             Count  Frequency (%)
남해군            11     21.6%
창원시            5      9.8%
의령군            5      9.8%
거창군            4      7.8%
합천군            4      7.8%
양산시            3      5.9%
산청군            3      5.9%
진주시            2      3.9%
김해시            2      3.9%
함안군            2      3.9%
Other values (7)  10     19.6%

정수장명
Text

UNIQUE 

Distinct: 51
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 540.0 B

Length

Max length: 7
Median length: 2
Mean length: 2.3921569
Min length: 2

Characters and Unicode

Total characters: 122
Distinct characters: 74
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
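
To illustrate how such character properties are obtained, the sketch below classifies the characters of five plant names that appear in this report's sample rows using the standard library's `unicodedata` module. The mapping from two-letter category codes to the long names used in this report is written out by hand, and the five names are only a subset of the 51 values, so the counts differ from the full-column figures in the report.

```python
import unicodedata
from collections import Counter

# Five 정수장명 values taken from the sample rows of this report.
names = ["대산(1만)", "대산(12만)", "북면", "창원칠서", "석동"]

# Long names for the general-category codes that occur in these values.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}

# Normalize to NFC so each Hangul syllable is one precomposed code point.
category_counts = Counter(
    unicodedata.category(ch)
    for name in names
    for ch in unicodedata.normalize("NFC", name)
)

for code, count in category_counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```

Hangul syllables fall under Other Letter, digits under Decimal Number, and the parentheses under Open and Close Punctuation, which are the same four categories the report lists for the full column.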

Unique

Unique: 51
Unique (%): 100.0%

Sample

1st row: 대산(1만)
2nd row: 대산(12만)
3rd row: 북면
4th row: 창원칠서
5th row: 석동
Value              Count  Frequency (%)
대산(1만           1      2.0%
봉성               1      2.0%
상주               1      2.0%
지족               1      2.0%
남면               1      2.0%
대곡               1      2.0%
창선               1      2.0%
항도               1      2.0%
선원               1      2.0%
노구               1      2.0%
Other values (41)  41     80.4%

Most occurring characters

Value              Count  Frequency (%)
[?]                5      4.1%
[?]                4      3.3%
[?]                4      3.3%
[?]                4      3.3%
[?]                3      2.5%
[?]                3      2.5%
[?]                3      2.5%
[?]                3      2.5%
[?]                3      2.5%
1                  3      2.5%
Other values (64)  87     71.3%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       113    92.6%
Decimal Number     5      4.1%
Close Punctuation  2      1.6%
Open Punctuation   2      1.6%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
[?]                5      4.4%
[?]                4      3.5%
[?]                4      3.5%
[?]                4      3.5%
[?]                3      2.7%
[?]                3      2.7%
[?]                3      2.7%
[?]                3      2.7%
[?]                3      2.7%
[?]                3      2.7%
Other values (60)  78     69.0%

Decimal Number
Value  Count  Frequency (%)
1      3      60.0%
2      2      40.0%

Close Punctuation
Value  Count  Frequency (%)
)      2      100.0%

Open Punctuation
Value  Count  Frequency (%)
(      2      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  113    92.6%
Common  9      7.4%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
[?]                5      4.4%
[?]                4      3.5%
[?]                4      3.5%
[?]                4      3.5%
[?]                3      2.7%
[?]                3      2.7%
[?]                3      2.7%
[?]                3      2.7%
[?]                3      2.7%
[?]                3      2.7%
Other values (60)  78     69.0%

Common
Value  Count  Frequency (%)
1      3      33.3%
2      2      22.2%
)      2      22.2%
(      2      22.2%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  113    92.6%
ASCII   9      7.4%

Most frequent character per block

Hangul
Value              Count  Frequency (%)
[?]                5      4.4%
[?]                4      3.5%
[?]                4      3.5%
[?]                4      3.5%
[?]                3      2.7%
[?]                3      2.7%
[?]                3      2.7%
[?]                3      2.7%
[?]                3      2.7%
[?]                3      2.7%
Other values (60)  78     69.0%

ASCII
Value  Count  Frequency (%)
1      3      33.3%
2      2      22.2%
)      2      22.2%
(      2      22.2%

정수장 시설용량
Real number (ℝ)

Distinct: 32
Distinct (%): 62.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27509.02
Minimum: 700
Maximum: 400000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 591.0 B

Quantile statistics

Minimum: 700
5-th percentile: 800
Q1: 1000
median: 3000
Q3: 13000
95-th percentile: 130000
Maximum: 400000
Range: 399300
Interquartile range (IQR): 12000

Descriptive statistics

Standard deviation: 65710.904
Coefficient of variation (CV): 2.3887039
Kurtosis: 20.907882
Mean: 27509.02
Median Absolute Deviation (MAD): 2200
Skewness: 4.1693433
Sum: 1402960
Variance: 4.3179228 × 10^9
Monotonicity: Not monotonic
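
Several of the figures above are derived from one another, so they can be checked for internal consistency without the raw data. The sketch below uses only numbers printed in this report; because the report rounds the standard deviation, the recomputed variance and CV agree only approximately.

```python
import math

# Values copied from this report for 정수장 시설용량.
n = 51                  # number of observations
total = 1402960         # Sum
std_dev = 65710.904     # Standard deviation (rounded in the report)
minimum, q1, q3, maximum = 700, 1000, 13000, 400000

mean = total / n                 # Mean = Sum / number of observations
cv = std_dev / mean              # Coefficient of variation
variance = std_dev ** 2          # Variance = standard deviation squared
value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range

print(round(mean, 2), round(cv, 6), f"{variance:.7e}", value_range, iqr)
```

The mean (about 27509) is roughly nine times the median (3000), which fits the strong positive skewness reported above: a few very large plants dominate the total capacity.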
Histogram with fixed size bins (bins=32)
Value              Count  Frequency (%)
800                5      9.8%
3000               4      7.8%
2000               4      7.8%
1000               3      5.9%
10000              3      5.9%
4000               2      3.9%
2500               2      3.9%
950                2      3.9%
700                2      3.9%
20000              2      3.9%
Other values (22)  22     43.1%

Value  Count  Frequency (%)
700    2      3.9%
800    5      9.8%
900    1      2.0%
950    2      3.9%
960    1      2.0%
1000   3      5.9%
1500   1      2.0%
2000   4      7.8%
2200   1      2.0%
2500   2      3.9%

Value   Count  Frequency (%)
400000  1      2.0%
165000  1      2.0%
140000  1      2.0%
120000  1      2.0%
105000  1      2.0%
100000  1      2.0%
60000   1      2.0%
55000   1      2.0%
50000   1      2.0%
38000   1      2.0%

Interactions


Correlations

                 연번    시군명   정수장명  정수장 시설용량
연번             1.000   0.958   1.000    0.296
시군명           0.958   1.000   1.000    0.687
정수장명         1.000   1.000   1.000    1.000
정수장 시설용량  0.296   0.687   1.000    1.000
                 연번     정수장 시설용량  시군명
연번             1.000    -0.464           0.769
정수장 시설용량  -0.464   1.000            0.365
시군명           0.769    0.365            1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

    연번  시군명  정수장명    정수장 시설용량
0   1     창원시  대산(1만)   10000
1   2     창원시  대산(12만)  120000
2   3     창원시  북면        10000
3   4     창원시  창원칠서    400000
4   5     창원시  석동        100000
5   6     진주시  진주1       60000
6   7     진주시  진주2       140000
7   8     통영시  욕지        800
8   9     사천시  곤명        2000
9   10    김해시  삼계        165000

    연번  시군명  정수장명  정수장 시설용량
41  42    함양군  서상      3700
42  43    함양군  함양      9000
43  44    거창군  위천      800
44  45    거창군  거창      20000
45  46    거창군  가조      3300
46  47    거창군  웅양      800
47  48    합천군  합천      10000
48  49    합천군  적중      3500
49  50    합천군  해인사    700
50  51    합천군  가야      2200