Overview

Dataset statistics

Number of variables: 6
Number of observations: 2150
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 105.1 KiB
Average record size in memory: 50.1 B
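
The average record size is the total memory footprint divided by the number of observations. The short Python check below is a sketch of that arithmetic, assuming 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
total_size_kib = 105.1
n_observations = 2150
n_variables = 6
missing_cells = 0

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

# Every observation has one cell per variable.
total_cells = n_observations * n_variables
missing_pct = 100 * missing_cells / total_cells

print(f"Average record size: {avg_record_bytes:.1f} B")
print(f"Total cells: {total_cells}")
print(f"Missing cells: {missing_pct:.1f}%")
```

Rounded to one decimal place, the computed average agrees with the 50.1 B shown in the report.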

Variable types

Categorical: 4
Text: 1
Numeric: 1

Dataset

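Description: Public sewage treatment facilities in Jeonbuk Special Self-Governing Province (전북특별자치도), by city/county and facility: facility capacity, treatment volume, and linked treatment volume (physical, biological, advanced, night soil, livestock, leachate, other)
Author: 전북특별자치도
URL: https://www.data.go.kr/data/15117488/fileData.do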

Alerts

처리현황 is highly overall correlated with 처리종류 (High correlation)
처리종류 is highly overall correlated with 처리현황 (High correlation)
처리량 has 1609 (74.8%) zeros (Zeros)

Reproduction

Analysis started: 2024-03-14 21:05:50.133203
Analysis finished: 2024-03-14 21:05:51.631139
Duration: 1.5 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연도
Categorical

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 16.9 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2018
2nd row: 2018
3rd row: 2018
4th row: 2018
5th row: 2018

Common Values

Value | Count | Frequency (%)
2021 | 550 | 25.6%
2019 | 549 | 25.5%
2020 | 548 | 25.5%
2018 | 503 | 23.4%
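
The frequency column is each count divided by the total number of observations. The following Python sketch recomputes the 연도 percentages from the counts above; `frequency_table` is a hypothetical helper written for this illustration, not a ydata-profiling function.

```python
from collections import Counter

# Counts copied from the Common Values table for 연도.
counts = Counter({"2021": 550, "2019": 549, "2020": 548, "2018": 503})
total = sum(counts.values())


def frequency_table(counter):
    """Return (value, count, percent) rows, most frequent first."""
    n = sum(counter.values())
    return [
        (value, count, round(100 * count / n, 1))
        for value, count in counter.most_common()
    ]


for value, count, pct in frequency_table(counts):
    print(f"{value} | {count} | {pct}%")
```

The four counts sum to the 2150 observations reported in the overview.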

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

시군
Categorical

Distinct: 14
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 16.9 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전주시
2nd row: 전주시
3rd row: 전주시
4th row: 전주시
5th row: 전주시

Common Values

Value | Count | Frequency (%)
부안군 | 352 | 16.4%
군산시 | 240 | 11.2%
무주군 | 200 | 9.3%
익산시 | 199 | 9.3%
고창군 | 190 | 8.8%
장수군 | 173 | 8.0%
김제시 | 160 | 7.4%
완주군 | 159 | 7.4%
남원시 | 120 | 5.6%
정읍시 | 80 | 3.7%
Other values (4) | 277 | 12.9%

Length

[Figure: Histogram of lengths of the category]

시군 및 시설별
Text

Distinct: 55
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 16.9 KiB

Length

Max length: 6
Median length: 2
Mean length: 2.2046512
Min length: 2

Characters and Unicode

Total characters: 4740
Distinct characters: 76
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전주
2nd row: 전주
3rd row: 전주
4th row: 전주
5th row: 전주

Value | Count | Frequency (%)
용담 | 40 | 1.9%
진서 | 40 | 1.9%
장계 | 40 | 1.9%
소양 | 40 | 1.9%
금마 | 40 | 1.9%
수질복원센터 | 40 | 1.9%
무주 | 40 | 1.9%
무풍 | 40 | 1.9%
구이 | 40 | 1.9%
안성 | 40 | 1.9%
Other values (45) | 1750 | 81.4%

Most occurring characters

Value | Count | Frequency (%)
 | 392 | 8.3%
 | 224 | 4.7%
 | 160 | 3.4%
 | 155 | 3.3%
 | 144 | 3.0%
 | 120 | 2.5%
 | 120 | 2.5%
 | 120 | 2.5%
 | 120 | 2.5%
 | 110 | 2.3%
Other values (66) | 3075 | 64.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4700 | 99.2%
Other Punctuation | 40 | 0.8%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 392 | 8.3%
 | 224 | 4.8%
 | 160 | 3.4%
 | 155 | 3.3%
 | 144 | 3.1%
 | 120 | 2.6%
 | 120 | 2.6%
 | 120 | 2.6%
 | 120 | 2.6%
 | 110 | 2.3%
Other values (65) | 3035 | 64.6%

Other Punctuation
Value | Count | Frequency (%)
· | 40 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4700 | 99.2%
Common | 40 | 0.8%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 392 | 8.3%
 | 224 | 4.8%
 | 160 | 3.4%
 | 155 | 3.3%
 | 144 | 3.1%
 | 120 | 2.6%
 | 120 | 2.6%
 | 120 | 2.6%
 | 120 | 2.6%
 | 110 | 2.3%
Other values (65) | 3035 | 64.6%

Common
Value | Count | Frequency (%)
· | 40 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4700 | 99.2%
None | 40 | 0.8%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 392 | 8.3%
 | 224 | 4.8%
 | 160 | 3.4%
 | 155 | 3.3%
 | 144 | 3.1%
 | 120 | 2.6%
 | 120 | 2.6%
 | 120 | 2.6%
 | 120 | 2.6%
 | 110 | 2.3%
Other values (65) | 3035 | 64.6%

None
Value | Count | Frequency (%)
· | 40 | 100.0%

처리현황
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.9 KiB

Length

Max length: 6
Median length: 5
Mean length: 5.0972093
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 시설용량
2nd row: 시설용량
3rd row: 시설용량
4th row: 처리량
5th row: 처리량

Common Values

Value | Count | Frequency (%)
연계처리량 | 857 | 39.9%
처리량 | 648 | 30.1%
시설용량 | 645 | 30.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

처리종류
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 16.9 KiB

Length

Max length: 4
Median length: 3.5
Mean length: 2.7
Min length: 2
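
The mean length is a count-weighted average of the category string lengths. The Python sketch below recomputes the mean, maximum, and minimum length of 처리종류 from the counts listed under Common Values, assuming each value is stored exactly as displayed, with no surrounding whitespace.

```python
# Counts copied from the Common Values table for 처리종류.
counts = {
    "고도": 433,
    "물리적": 430,
    "생물학적": 430,
    "축산": 215,
    "침출수": 215,
    "분뇨": 214,
    "기타": 213,
}

n = sum(counts.values())

# len() counts Unicode code points, i.e. one per Hangul syllable here.
total_chars = sum(len(value) * count for value, count in counts.items())
mean_length = total_chars / n
max_length = max(len(value) for value in counts)
min_length = min(len(value) for value in counts)

print(f"n={n}, mean={mean_length:.1f}, max={max_length}, min={min_length}")
```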

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 물리적
2nd row: 생물학적
3rd row: 고도
4th row: 물리적
5th row: 생물학적

Common Values

Value | Count | Frequency (%)
고도 | 433 | 20.1%
물리적 | 430 | 20.0%
생물학적 | 430 | 20.0%
축산 | 215 | 10.0%
침출수 | 215 | 10.0%
분뇨 | 214 | 10.0%
기타 | 213 | 9.9%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

처리량
Real number (ℝ)

ZEROS 

Distinct: 321
Distinct (%): 14.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3460.9377
Minimum: 0
Maximum: 403000
Zeros: 1609
Zeros (%): 74.8%
Negative: 0
Negative (%): 0.0%
Memory size: 19.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 5.75
95-th percentile: 6000
Maximum: 403000
Range: 403000
Interquartile range (IQR): 5.75
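
Because 74.8% of the 처리량 values are zero, Q1 and the median are 0, and Q3 lands just past the block of zeros. The Python sketch below reproduces Q3 = 5.75 from the zero count and the smallest non-zero values listed further down in this section. It assumes linear interpolation between order statistics (the default in NumPy and pandas); `quantile_linear` is a hypothetical helper, and the large values are placeholders that do not influence the three quartiles.

```python
def quantile_linear(sorted_values, q):
    """Quantile with linear interpolation between neighbouring order statistics."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


n = 2150
zeros = 1609

# Smallest non-zero values from the report (9.0 occurs three times).
smallest_nonzero = [2.0, 3.0, 5.0, 6.0, 9.0, 9.0, 9.0, 10.0, 11.0, 13.0, 13.5]

# Placeholder for the remaining larger values; their exact size is
# irrelevant for Q1, the median, and Q3.
filler = [403000.0] * (n - zeros - len(smallest_nonzero))

values = [0.0] * zeros + smallest_nonzero + filler

q1 = quantile_linear(values, 0.25)
median = quantile_linear(values, 0.50)
q3 = quantile_linear(values, 0.75)
iqr = q3 - q1
zero_pct = round(100 * zeros / n, 1)

print(q1, median, q3, iqr, zero_pct)
```

With 2150 observations the Q3 position is 0.75 × 2149 = 1611.75, which falls between the sorted values 5.0 and 6.0, hence 5.75.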

Descriptive statistics

Standard deviation: 25966.983
Coefficient of variation (CV): 7.502875
Kurtosis: 162.80392
Mean: 3460.9377
Median Absolute Deviation (MAD): 0
Skewness: 12.10086
Sum: 7441016.1
Variance: 6.7428421 × 10^8
Monotonicity: Not monotonic
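
Several of these statistics are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The Python sketch below checks the reported figures against those identities. The inputs are the rounded values printed in the report, so agreement is only expected up to rounding.

```python
import math

# Rounded figures copied from the report.
n = 2150
mean = 3460.9377
std = 25966.983
minimum, maximum = 0.0, 403000.0

cv = std / mean            # coefficient of variation
variance = std ** 2        # variance is the squared standard deviation
total = mean * n           # sum of all values
value_range = maximum - minimum

print(f"CV       = {cv:.6f}")
print(f"Variance = {variance:.7e}")
print(f"Sum      = {total:.1f}")
print(f"Range    = {value_range:.0f}")

# Compare with the reported values, allowing for rounding of the inputs.
assert math.isclose(cv, 7.502875, rel_tol=1e-6)
assert math.isclose(variance, 6.7428421e8, rel_tol=1e-6)
assert math.isclose(total, 7441016.1, rel_tol=1e-6)
```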
[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
0.0 | 1609 | 74.8%
500.0 | 19 | 0.9%
600.0 | 16 | 0.7%
800.0 | 12 | 0.6%
16000.0 | 12 | 0.6%
550.0 | 12 | 0.6%
700.0 | 11 | 0.5%
1600.0 | 8 | 0.4%
1700.0 | 8 | 0.4%
2000.0 | 8 | 0.4%
Other values (311) | 435 | 20.2%

Minimum 10 values

Value | Count | Frequency (%)
0.0 | 1609 | 74.8%
2.0 | 1 | < 0.1%
3.0 | 1 | < 0.1%
5.0 | 1 | < 0.1%
6.0 | 1 | < 0.1%
9.0 | 3 | 0.1%
10.0 | 1 | < 0.1%
11.0 | 1 | < 0.1%
13.0 | 1 | < 0.1%
13.5 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
403000.0 | 4 | 0.2%
352935.0 | 1 | < 0.1%
346802.0 | 1 | < 0.1%
334762.0 | 1 | < 0.1%
329211.0 | 1 | < 0.1%
200000.0 | 4 | 0.2%
127769.0 | 1 | < 0.1%
126868.0 | 1 | < 0.1%
124623.0 | 1 | < 0.1%
122183.0 | 1 | < 0.1%

Interactions

[Figure: Interactions plot]

Correlations

[Figure: Correlation heatmap]

 | 연도 | 시군 | 시군 및 시설별 | 처리현황 | 처리종류 | 처리량
연도 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
시군 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.444
시군 및 시설별 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.646
처리현황 | 0.000 | 0.000 | 0.000 | 1.000 | 0.768 | 0.137
처리종류 | 0.000 | 0.000 | 0.000 | 0.768 | 1.000 | 0.253
처리량 | 0.000 | 0.444 | 0.646 | 0.137 | 0.253 | 1.000

[Figure: Correlation heatmap]

 | 시군 | 연도 | 처리종류 | 처리현황
시군 | 1.000 | 0.000 | 0.000 | 0.000
연도 | 0.000 | 1.000 | 0.000 | 0.000
처리종류 | 0.000 | 0.000 | 1.000 | 0.705
처리현황 | 0.000 | 0.000 | 0.705 | 1.000

[Figure: Correlation heatmap]

 | 처리량 | 연도 | 시군 | 처리현황 | 처리종류
처리량 | 1.000 | 0.000 | 0.179 | 0.092 | 0.090
연도 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000
시군 | 0.179 | 0.000 | 1.000 | 0.000 | 0.000
처리현황 | 0.092 | 0.000 | 0.000 | 1.000 | 0.705
처리종류 | 0.090 | 0.000 | 0.000 | 0.705 | 1.000
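
The report does not state which association measure produced each matrix. For pairs of categorical variables, Cramér's V is a common choice, so the Python sketch below shows how the plain (uncorrected) statistic could be computed for 처리현황 and 처리종류. The `cramers_v` function is a hypothetical helper, the input is only the ten-row pattern visible in the Sample section, and a profiler may apply a bias correction, so the result need not match the matrices exactly.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Plain (uncorrected) Cramér's V for two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# 처리현황 / 처리종류 pairs of the first ten sample rows.
status = ["시설용량"] * 3 + ["처리량"] * 3 + ["연계처리량"] * 4
kind = ["물리적", "생물학적", "고도"] * 2 + ["분뇨", "축산", "침출수", "기타"]

print(f"Cramér's V = {cramers_v(status, kind):.3f}")
```

On these ten rows the statistic is about 0.707. The association is strong because each 처리종류 value appears under only some 처리현황 values (분뇨, 축산, 침출수, and 기타 occur only with 연계처리량 in the sample).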

Missing values

[Figure: Nullity bar chart]
A simple visualization of nullity by column.

[Figure: Nullity matrix]
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

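First rows

 | 연도 | 시군 | 시군 및 시설별 | 처리현황 | 처리종류 | 처리량
0 | 2018 | 전주시 | 전주 | 시설용량 | 물리적 | 0.0
1 | 2018 | 전주시 | 전주 | 시설용량 | 생물학적 | 0.0
2 | 2018 | 전주시 | 전주 | 시설용량 | 고도 | 403000.0
3 | 2018 | 전주시 | 전주 | 처리량 | 물리적 | 0.0
4 | 2018 | 전주시 | 전주 | 처리량 | 생물학적 | 0.0
5 | 2018 | 전주시 | 전주 | 처리량 | 고도 | 329211.0
6 | 2018 | 전주시 | 전주 | 연계처리량 | 분뇨 | 0.0
7 | 2018 | 전주시 | 전주 | 연계처리량 | 축산 | 0.0
8 | 2018 | 전주시 | 전주 | 연계처리량 | 침출수 | 939.0
9 | 2018 | 전주시 | 전주 | 연계처리량 | 기타 | 0.0

Last rows

 | 연도 | 시군 | 시군 및 시설별 | 처리현황 | 처리종류 | 처리량
2140 | 2021 | 부안군 | 하서 | 시설용량 | 물리적 | 0.0
2141 | 2021 | 부안군 | 하서 | 시설용량 | 생물학적 | 0.0
2142 | 2021 | 부안군 | 하서 | 시설용량 | 고도 | 500.0
2143 | 2021 | 부안군 | 하서 | 처리량 | 물리적 | 0.0
2144 | 2021 | 부안군 | 하서 | 처리량 | 생물학적 | 0.0
2145 | 2021 | 부안군 | 하서 | 처리량 | 고도 | 284.0
2146 | 2021 | 부안군 | 하서 | 연계처리량 | 분뇨 | 0.0
2147 | 2021 | 부안군 | 하서 | 연계처리량 | 축산 | 0.0
2148 | 2021 | 부안군 | 하서 | 연계처리량 | 침출수 | 0.0
2149 | 2021 | 부안군 | 하서 | 연계처리량 | 기타 | 0.0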
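
The sample shows that the data is stored in long format: each facility and year has ten rows, one per combination of 처리현황 and 처리종류, and most of them are 0.0. The Python sketch below groups the first ten sample rows into a single wide record per facility and year. `pivot_wide` is a hypothetical helper, and the capacity ratio at the end assumes that the 시설용량 and 처리량 rows share the same unit, which the report does not state.

```python
# First ten sample rows: (연도, 시군, 시군 및 시설별, 처리현황, 처리종류, 처리량).
rows = [
    (2018, "전주시", "전주", "시설용량", "물리적", 0.0),
    (2018, "전주시", "전주", "시설용량", "생물학적", 0.0),
    (2018, "전주시", "전주", "시설용량", "고도", 403000.0),
    (2018, "전주시", "전주", "처리량", "물리적", 0.0),
    (2018, "전주시", "전주", "처리량", "생물학적", 0.0),
    (2018, "전주시", "전주", "처리량", "고도", 329211.0),
    (2018, "전주시", "전주", "연계처리량", "분뇨", 0.0),
    (2018, "전주시", "전주", "연계처리량", "축산", 0.0),
    (2018, "전주시", "전주", "연계처리량", "침출수", 939.0),
    (2018, "전주시", "전주", "연계처리량", "기타", 0.0),
]


def pivot_wide(records):
    """Group long-format records into one dict per (year, city/county, facility)."""
    wide = {}
    for year, sigun, facility, status, kind, amount in records:
        wide.setdefault((year, sigun, facility), {})[(status, kind)] = amount
    return wide


wide = pivot_wide(rows)
jeonju = wide[(2018, "전주시", "전주")]

capacity = jeonju[("시설용량", "고도")]
treated = jeonju[("처리량", "고도")]
zero_cells = sum(1 for amount in jeonju.values() if amount == 0.0)

print(f"Treated / capacity (고도): {treated / capacity:.1%}")
print(f"Zero cells in this record: {zero_cells} of {len(jeonju)}")
```

Seven of the ten cells in this record are zero, which is consistent with the high share of zeros flagged for 처리량.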