Overview

Dataset statistics

Number of variables: 5
Number of observations: 1246
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 52.5 KiB
Average record size in memory: 43.1 B
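The derived figures in this block can be checked from the raw counts. The following is a minimal sketch, assuming that the average record size is the total memory divided by the number of observations and that a cell is one value of one variable in one row. The variable names are illustrative and not part of the report.

```python
# Values copied from the "Dataset statistics" block.
N_VARIABLES = 5
N_OBSERVATIONS = 1246
TOTAL_SIZE_KIB = 52.5   # as displayed, already rounded to one decimal
MISSING_CELLS = 0

# Assumption: a "cell" is one value of one variable in one row.
total_cells = N_VARIABLES * N_OBSERVATIONS

# Assumption: average record size = total bytes / number of rows.
avg_record_bytes = TOTAL_SIZE_KIB * 1024 / N_OBSERVATIONS

missing_pct = 100 * MISSING_CELLS / total_cells

print(total_cells)                  # 6230
print(round(avg_record_bytes, 1))   # 43.1
print(f"{missing_pct:.1f}%")        # 0.0%
```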

Variable types

Categorical: 2
Text: 1
Numeric: 2

Dataset

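Description: One of the statistical indices developed by Gimhae-si (김해시) to assess the state of the city on a statistical basis. It consists of the statistical year (통계연도), province name (시도명), city/county/district name (시군구명), industrial wastewater discharge (산업폐수방류량, m³/day), and number of establishments (업소수). Because the index is centered on Gimhae-si, information for regions outside Gimhae-si may be missing owing to difficulties in collecting and processing the data.
Author: 경상남도 김해시 (Gimhae-si, Gyeongsangnam-do)
URL: https://www.data.go.kr/data/15110175/fileData.do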

Alerts

산업폐수방류량(세제곱미터 퍼 일) is highly overall correlated with 업소수(개) (High correlation)
업소수(개) is highly overall correlated with 산업폐수방류량(세제곱미터 퍼 일) (High correlation)

Reproduction

Analysis started: 2023-12-12 21:55:45.202657
Analysis finished: 2023-12-12 21:55:46.108419
Duration: 0.91 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
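The reported duration follows from the two timestamps. A small sketch using only the Python standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2023-12-12 21:55:45.202657")
finished = datetime.fromisoformat("2023-12-12 21:55:46.108419")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()

print(f"{duration_seconds:.2f} seconds")  # 0.91 seconds
```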

Variables

통계연도
Categorical

Distinct: 5
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 9.9 KiB

Value | Count
2015 | 250
2016 | 249
2017 | 249
2018 | 249
2019 | 249

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2015
2nd row: 2015
3rd row: 2015
4th row: 2015
5th row: 2015

Common Values

Value | Count | Frequency (%)
2015 | 250 | 20.1%
2016 | 249 | 20.0%
2017 | 249 | 20.0%
2018 | 249 | 20.0%
2019 | 249 | 20.0%
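The Frequency (%) column is each count divided by the total number of rows, and Distinct (%) is the number of distinct values divided by the same total. A minimal sketch that recomputes both from the counts listed for 통계연도 (the dictionary and variable names are illustrative):

```python
# Counts per year, copied from the "Common Values" table for 통계연도.
counts = {"2015": 250, "2016": 249, "2017": 249, "2018": 249, "2019": 249}

# The counts cover every row, so their sum is the number of observations.
n_rows = sum(counts.values())

# Share of rows per value, rounded to one decimal as in the report.
freq_pct = {year: round(100 * c / n_rows, 1) for year, c in counts.items()}

# Share of distinct values relative to the number of rows.
distinct_pct = round(100 * len(counts) / n_rows, 1)

print(n_rows)        # 1246
print(freq_pct)      # {'2015': 20.1, '2016': 20.0, '2017': 20.0, '2018': 20.0, '2019': 20.0}
print(distinct_pct)  # 0.4
```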

Length

Histogram of lengths of the category


시도명
Categorical

Distinct: 16
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 9.9 KiB

Value | Count
경기도 | 212
서울특별시 | 125
경상북도 | 120
전라남도 | 110
경상남도 | 110
Other values (11) | 569

Length

Max length: 7
Median length: 5
Mean length: 4.0786517
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value | Count | Frequency (%)
경기도 | 212 | 17.0%
서울특별시 | 125 | 10.0%
경상북도 | 120 | 9.6%
전라남도 | 110 | 8.8%
경상남도 | 110 | 8.8%
강원도 | 90 | 7.2%
부산광역시 | 80 | 6.4%
충청남도 | 80 | 6.4%
전라북도 | 75 | 6.0%
충청북도 | 69 | 5.5%
Other values (6) | 175 | 14.0%
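The "Other values (6)" row is the remainder after the ten listed values: the total row count minus the listed counts, and the distinct count minus the number of listed values. A short sketch that reproduces it from the figures reported for 시도명 (the names are illustrative):

```python
# Figures copied from the 시도명 summary and its "Common Values" table.
N_ROWS = 1246
N_DISTINCT = 16
listed = {
    "경기도": 212, "서울특별시": 125, "경상북도": 120, "전라남도": 110,
    "경상남도": 110, "강원도": 90, "부산광역시": 80, "충청남도": 80,
    "전라북도": 75, "충청북도": 69,
}

# Rows and distinct values not covered by the listed entries.
other_count = N_ROWS - sum(listed.values())
other_distinct = N_DISTINCT - len(listed)
other_pct = round(100 * other_count / N_ROWS, 1)

print(f"Other values ({other_distinct}) | {other_count} | {other_pct}%")
# Other values (6) | 175 | 14.0%
```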

Length

Histogram of lengths of the category

시군구명
Text

Distinct: 233
Distinct (%): 18.7%
Missing: 0
Missing (%): 0.0%
Memory size: 9.9 KiB

Length

Max length: 8
Median length: 3
Mean length: 3.3515249
Min length: 2

Characters and Unicode

Total characters: 4176
Distinct characters: 142
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
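As an illustration of these character properties, the sketch below inspects the five sample values of 시군구명 with Python's standard `unicodedata` module. It is not the report's own implementation. The general category `Lo` corresponds to "Other Letter"; the range U+AC00 to U+D7A3 is the Unicode Hangul Syllables block, and treating it as what the report labels "Hangul" is an assumption. The last lines check that total characters divided by the number of rows gives the reported mean length.

```python
import unicodedata
from collections import Counter

# The five sample values shown for 시군구명.
sample = ["종로구", "중구", "용산구", "성동구", "광진구"]

# Count every character across the sample values.
char_counts = Counter("".join(sample))

# General category per character occurrence ("Lo" = Letter, other).
categories = Counter(unicodedata.category(ch) for ch in char_counts.elements())

def in_hangul_syllables(ch: str) -> bool:
    """True if ch lies in the Hangul Syllables block (U+AC00..U+D7A3)."""
    return 0xAC00 <= ord(ch) <= 0xD7A3

all_hangul = all(in_hangul_syllables(ch) for ch in char_counts)

print(char_counts.most_common(1))  # [('구', 5)]
print(dict(categories))            # {'Lo': 14}
print(all_hangul)                  # True

# Mean length = total characters / number of rows (figures from the report).
mean_length = 4176 / 1246
print(round(mean_length, 7))       # 3.3515249
```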

Unique

Unique: 5
Unique (%): 0.4%

Sample

1st row: 종로구
2nd row: 중구
3rd row: 용산구
4th row: 성동구
5th row: 광진구

Value | Count | Frequency (%)
중구 | 30 | 2.4%
동구 | 30 | 2.4%
서구 | 25 | 2.0%
북구 | 20 | 1.6%
남구 | 20 | 1.6%
고성군 | 10 | 0.8%
강서구 | 10 | 0.8%
남원시 | 5 | 0.4%
장수군 | 5 | 0.4%
무주군 | 5 | 0.4%
Other values (223) | 1086 | 87.2%

Most occurring characters

Value | Count | Frequency (%)
 | 531 | 12.7%
 | 492 | 11.8%
 | 429 | 10.3%
 | 118 | 2.8%
 | 117 | 2.8%
 | 116 | 2.8%
 | 110 | 2.6%
 | 105 | 2.5%
 | 100 | 2.4%
 | 90 | 2.2%
Other values (132) | 1968 | 47.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4176 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 531 | 12.7%
 | 492 | 11.8%
 | 429 | 10.3%
 | 118 | 2.8%
 | 117 | 2.8%
 | 116 | 2.8%
 | 110 | 2.6%
 | 105 | 2.5%
 | 100 | 2.4%
 | 90 | 2.2%
Other values (132) | 1968 | 47.1%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4176 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 531 | 12.7%
 | 492 | 11.8%
 | 429 | 10.3%
 | 118 | 2.8%
 | 117 | 2.8%
 | 116 | 2.8%
 | 110 | 2.6%
 | 105 | 2.5%
 | 100 | 2.4%
 | 90 | 2.2%
Other values (132) | 1968 | 47.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4176 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 531 | 12.7%
 | 492 | 11.8%
 | 429 | 10.3%
 | 118 | 2.8%
 | 117 | 2.8%
 | 116 | 2.8%
 | 110 | 2.6%
 | 105 | 2.5%
 | 100 | 2.4%
 | 90 | 2.2%
Other values (132) | 1968 | 47.1%

산업폐수방류량(세제곱미터 퍼 일)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1165
Distinct (%): 93.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14843.59
Minimum: 1
Maximum: 296873
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 163.25
Q1: 1128
Median: 3183
Q3: 10380
95-th percentile: 87369
Maximum: 296873
Range: 296872
Interquartile range (IQR): 9252

Descriptive statistics

Standard deviation: 34008.495
Coefficient of variation (CV): 2.2911233
Kurtosis: 18.534443
Mean: 14843.59
Median Absolute Deviation (MAD): 2618
Skewness: 4.0410342
Sum: 18495113
Variance: 1.1565778 × 10^9
Monotonicity: Not monotonic
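Several of these statistics are tied together by simple identities: mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, and range = maximum - minimum. The sketch below checks the values reported for 산업폐수방류량(세제곱미터 퍼 일) against those identities. The inputs are the rounded figures as displayed, so agreement is only expected up to rounding.

```python
import math

# Figures copied from the report for 산업폐수방류량(세제곱미터 퍼 일).
N = 1246
TOTAL = 18495113
STD = 34008.495
Q1, Q3 = 1128, 10380
MINIMUM, MAXIMUM = 1, 296873

mean = TOTAL / N                 # arithmetic mean
cv = STD / mean                  # coefficient of variation
variance = STD ** 2              # variance is the squared standard deviation
iqr = Q3 - Q1                    # interquartile range
value_range = MAXIMUM - MINIMUM  # range

print(round(mean, 2))   # 14843.59
print(round(cv, 4))     # 2.2911
print(iqr)              # 9252
print(value_range)      # 296872

# The reported variance matches STD squared up to display rounding.
print(math.isclose(variance, 1.1565778e9, rel_tol=1e-6))  # True
```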
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
726 | 4 | 0.3%
60 | 4 | 0.3%
672 | 3 | 0.2%
741 | 3 | 0.2%
106 | 3 | 0.2%
55 | 3 | 0.2%
2603 | 3 | 0.2%
568 | 3 | 0.2%
704 | 3 | 0.2%
1128 | 3 | 0.2%
Other values (1155) | 1214 | 97.4%

Minimum 10 values

Value | Count | Frequency (%)
1 | 2 | 0.2%
3 | 1 | 0.1%
11 | 1 | 0.1%
17 | 1 | 0.1%
20 | 1 | 0.1%
22 | 1 | 0.1%
27 | 1 | 0.1%
34 | 1 | 0.1%
43 | 2 | 0.2%
47 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
296873 | 1 | 0.1%
252009 | 1 | 0.1%
250205 | 1 | 0.1%
236148 | 1 | 0.1%
229075 | 1 | 0.1%
227821 | 1 | 0.1%
198766 | 1 | 0.1%
193075 | 1 | 0.1%
192700 | 1 | 0.1%
188075 | 1 | 0.1%

업소수(개)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 467
Distinct (%): 37.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 210.2496
Minimum: 4
Maximum: 2695
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.1 KiB

Quantile statistics

Minimum: 4
5-th percentile: 27.25
Q1: 64.25
Median: 119.5
Q3: 246.5
95-th percentile: 673.25
Maximum: 2695
Range: 2691
Interquartile range (IQR): 182.25
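The quantiles of 업소수(개) are fractional (27.25, 64.25, 119.5) even though every listed value of this variable is a whole number. That pattern is consistent with linear interpolation between neighbouring order statistics, where the q-quantile sits at position (n - 1) * q in the sorted data. Whether the report uses exactly this rule is an assumption. The function below is a hypothetical helper written for illustration and is demonstrated on toy data, not on the dataset.

```python
import math

def quantile_linear(sorted_values, q):
    """Quantile by linear interpolation between neighbouring order statistics."""
    position = (len(sorted_values) - 1) * q           # fractional index into the sorted data
    lower = math.floor(position)
    upper = min(lower + 1, len(sorted_values) - 1)    # clamp at the last element
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

# Toy data (not from the dataset): integer inputs still give fractional quantiles.
toy = [10, 20, 30, 40]
print(quantile_linear(toy, 0.25))  # 17.5
print(quantile_linear(toy, 0.5))   # 25.0

# With n = 1246 rows, the 5% position is about 62.25, so the 5-th percentile
# would lie a quarter of the way between the 63rd and 64th smallest values.
position_5pct = (1246 - 1) * 0.05
```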

Descriptive statistics

Standard deviation: 258.64162
Coefficient of variation (CV): 1.2301646
Kurtosis: 22.371774
Mean: 210.2496
Median Absolute Deviation (MAD): 66.5
Skewness: 3.7276693
Sum: 261971
Variance: 66895.485
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
57 | 18 | 1.4%
48 | 12 | 1.0%
78 | 12 | 1.0%
61 | 11 | 0.9%
56 | 11 | 0.9%
64 | 11 | 0.9%
53 | 10 | 0.8%
79 | 10 | 0.8%
63 | 10 | 0.8%
69 | 10 | 0.8%
Other values (457) | 1131 | 90.8%

Minimum 10 values

Value | Count | Frequency (%)
4 | 2 | 0.2%
6 | 1 | 0.1%
7 | 3 | 0.2%
8 | 3 | 0.2%
9 | 1 | 0.1%
11 | 1 | 0.1%
12 | 1 | 0.1%
13 | 3 | 0.2%
14 | 3 | 0.2%
16 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
2695 | 1 | 0.1%
2590 | 1 | 0.1%
2527 | 1 | 0.1%
1929 | 1 | 0.1%
1870 | 1 | 0.1%
1446 | 1 | 0.1%
1365 | 1 | 0.1%
1237 | 1 | 0.1%
1205 | 1 | 0.1%
1176 | 1 | 0.1%

Interactions

[Figures: four interaction plots (Matplotlib v3.7.2)]

Correlations

Variable | 통계연도 | 시도명 | 산업폐수방류량(세제곱미터 퍼 일) | 업소수(개)
통계연도 | 1.000 | 0.000 | 0.000 | 0.000
시도명 | 0.000 | 1.000 | 0.464 | 0.287
산업폐수방류량(세제곱미터 퍼 일) | 0.000 | 0.464 | 1.000 | 0.510
업소수(개) | 0.000 | 0.287 | 0.510 | 1.000

Variable | 통계연도 | 시도명
통계연도 | 1.000 | 0.000
시도명 | 0.000 | 1.000

Variable | 산업폐수방류량(세제곱미터 퍼 일) | 업소수(개) | 통계연도 | 시도명
산업폐수방류량(세제곱미터 퍼 일) | 1.000 | 0.749 | 0.000 | 0.201
업소수(개) | 0.749 | 1.000 | 0.000 | 0.122
통계연도 | 0.000 | 0.000 | 1.000 | 0.000
시도명 | 0.201 | 0.122 | 0.000 | 1.000
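The two "High correlation" alerts are consistent with a simple rule: flag every pair of distinct variables whose absolute coefficient in the first matrix exceeds a cutoff. The sketch below applies such a rule with a cutoff of 0.5. Both the cutoff and the choice of the first matrix are assumptions, not values read from the report's config.json. `high_pairs` is a hypothetical helper.

```python
from itertools import combinations

# First correlation matrix from the report (row and column order as shown).
names = ["통계연도", "시도명", "산업폐수방류량(세제곱미터 퍼 일)", "업소수(개)"]
matrix = [
    [1.000, 0.000, 0.000, 0.000],
    [0.000, 1.000, 0.464, 0.287],
    [0.000, 0.464, 1.000, 0.510],
    [0.000, 0.287, 0.510, 1.000],
]

THRESHOLD = 0.5  # assumption: cutoff above which a pair is flagged

def high_pairs(names, matrix, threshold):
    """Return (name_a, name_b, coefficient) for each pair above the threshold."""
    return [
        (names[i], names[j], matrix[i][j])
        for i, j in combinations(range(len(names)), 2)
        if abs(matrix[i][j]) > threshold
    ]

pairs = high_pairs(names, matrix, THRESHOLD)
for a, b, r in pairs:
    print(f"{a} <-> {b}: {r:.3f}")
# 산업폐수방류량(세제곱미터 퍼 일) <-> 업소수(개): 0.510
```

Under this rule the pair 시도명 and 산업폐수방류량(세제곱미터 퍼 일), at 0.464, stays just below the cutoff, which matches the absence of an alert for it.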

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

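First rows

Index | 통계연도 | 시도명 | 시군구명 | 산업폐수방류량(세제곱미터 퍼 일) and 업소수(개) (digits run together)
0 | 2015 | 서울특별시 | 종로구 | 626128
1 | 2015 | 서울특별시 | 중구 | 1214551
2 | 2015 | 서울특별시 | 용산구 | 78163
3 | 2015 | 서울특별시 | 성동구 | 5306332
4 | 2015 | 서울특별시 | 광진구 | 854356
5 | 2015 | 서울특별시 | 동대문구 | 125566
6 | 2015 | 서울특별시 | 중랑구 | 92786
7 | 2015 | 서울특별시 | 성북구 | 128770
8 | 2015 | 서울특별시 | 강북구 | 68657
9 | 2015 | 서울특별시 | 도봉구 | 66287

Last rows

Index | 통계연도 | 시도명 | 시군구명 | 산업폐수방류량(세제곱미터 퍼 일) and 업소수(개) (digits run together)
1236 | 2019 | 경상남도 | 함양군 | 50042
1237 | 2019 | 경상남도 | 거창군 | 298174
1238 | 2019 | 경상남도 | 합천군 | 518279
1239 | 2019 | 경상남도 | 창원시성산구 | 25839270
1240 | 2019 | 경상남도 | 창원시의창구 | 7438236
1241 | 2019 | 경상남도 | 창원시진해구 | 961101
1242 | 2019 | 경상남도 | 창원시마산합포구 | 333140
1243 | 2019 | 경상남도 | 창원시마산회원구 | 6104261
1244 | 2019 | 제주특별자치도 | 제주시 | 5153434
1245 | 2019 | 제주특별자치도 | 서귀포시 | 4697143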