Overview

Dataset statistics

Number of variables: 15
Number of observations: 1617
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 194.4 KiB
Average record size in memory: 123.1 B
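
The average record size can be cross-checked from the two totals above. The following is a minimal sketch, assuming 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Cross-check "Average record size in memory" from the reported totals.
total_kib = 194.4        # "Total size in memory", as reported
n_observations = 1617    # "Number of observations", as reported

total_bytes = total_kib * 1024                   # assumption: 1 KiB = 1024 bytes
avg_record_bytes = total_bytes / n_observations  # bytes per row

print(f"{avg_record_bytes:.1f} B")
```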

Variable types

Numeric: 2
Categorical: 9
Text: 3
Boolean: 1

Dataset

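Description: Information on the business sites covered by the 2021 survey (2020 reference year) of discharges of specific water-quality hazardous substances (특정수질유해물질). It provides each site's name, competent authority, site location, wastewater treatment type, and related details.
Author: Ministry of Environment, National Institute of Environmental Research (환경부 국립환경과학원)
URL: https://www.data.go.kr/data/15123878/fileData.do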

Alerts

기준년 has constant value "2020" [Constant]
공개여부 has constant value "True" [Constant]
기타처리 is highly overall correlated with 순번 and 8 other fields [High correlation]
원폐수를 공동방지시설로 유입처리 is highly overall correlated with 순번 and 7 other fields [High correlation]
관할기관명 is highly overall correlated with 순번 and 7 other fields [High correlation]
개별처리 후 직접방류 is highly overall correlated with 순번 and 8 other fields [High correlation]
폐수처리장코드 is highly overall correlated with 개별처리 후 직접방류 and 5 other fields [High correlation]
위탁처리 is highly overall correlated with 순번 and 8 other fields [High correlation]
개별처리 후 공공하폐수처리시설로 유입처리 is highly overall correlated with 순번 and 8 other fields [High correlation]
원폐수를 공공하폐수처리시설로 유입처리 is highly overall correlated with 순번 and 7 other fields [High correlation]
순번 is highly overall correlated with 관할기관코드 and 7 other fields [High correlation]
관할기관코드 is highly overall correlated with 순번 and 7 other fields [High correlation]
원폐수를 공공하폐수처리시설로 유입처리 is highly imbalanced (62.1%) [Imbalance]
기타처리 is highly imbalanced (69.8%) [Imbalance]
순번 has unique values [Unique]
사업장명 has unique values [Unique]
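
The two imbalance percentages can be reproduced from the class counts in the corresponding frequency tables (1498 / 119 and 1530 / 87). The sketch below assumes the score is one minus the Shannon entropy normalised by log2 of the number of classes. That definition matches both reported figures, but it is inferred here rather than stated in the report. The function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): H is the Shannon entropy (in bits) of the
    class distribution, k the number of classes. 0 = balanced, 1 = one class."""
    if len(counts) < 2:
        return 0.0  # assumption: a single-class column is scored as 0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))

# Class counts copied from the frequency tables of the two flagged columns.
raw_to_public = imbalance_score([1498, 119])  # 원폐수를 공공하폐수처리시설로 유입처리
other = imbalance_score([1530, 87])           # 기타처리

print(f"{raw_to_public:.1%}, {other:.1%}")
```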

Reproduction

Analysis started: 2023-12-12 07:57:45.899979
Analysis finished: 2023-12-12 07:57:48.237556
Duration: 2.34 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
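
The reported duration follows directly from the two timestamps. A small sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-12 07:57:45.899979")
finished = datetime.fromisoformat("2023-12-12 07:57:48.237556")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```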

Variables

순번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 1617
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 809
Minimum: 1
Maximum: 1617
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 81.8
Q1: 405
median: 809
Q3: 1213
95-th percentile: 1536.2
Maximum: 1617
Range: 1616
Interquartile range (IQR): 808
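
Because 순번 is strictly increasing and unique over 1 to 1617, the quantiles above can be recomputed directly. The sketch assumes percentiles are obtained by linear interpolation between neighbouring order statistics. That rule reproduces the reported values, but the report does not state it. The helper `percentile` is illustrative.

```python
def percentile(sorted_values, q):
    """Percentile by linear interpolation between neighbouring order statistics."""
    position = (len(sorted_values) - 1) * q
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

sequence = list(range(1, 1618))  # the 순번 values 1..1617

p5 = percentile(sequence, 0.05)
q1 = percentile(sequence, 0.25)
median = percentile(sequence, 0.50)
q3 = percentile(sequence, 0.75)
p95 = percentile(sequence, 0.95)
iqr = q3 - q1

print(p5, q1, median, q3, p95, iqr)
```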

Descriptive statistics

Standard deviation: 466.93201
Coefficient of variation (CV): 0.57717183
Kurtosis: -1.2
Mean: 809
Median Absolute Deviation (MAD): 404
Skewness: 0
Sum: 1308153
Variance: 218025.5
Monotonicity: Strictly increasing
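
The descriptive statistics are consistent with the sequence 1 to 1617 when the variance uses the sample (n - 1) denominator. The check below uses only the standard library; the variable names are illustrative.

```python
import math
import statistics

sequence = list(range(1, 1618))  # the 순번 values 1..1617

total = sum(sequence)
mean = statistics.mean(sequence)
variance = statistics.variance(sequence)  # sample variance, n - 1 denominator
std_dev = math.sqrt(variance)
cv = std_dev / mean                       # coefficient of variation

print(total, mean, variance, round(std_dev, 5), round(cv, 8))
```

With the population (n) denominator the variance would be smaller than the reported 218025.5, so the report appears to use the sample estimator.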
Histogram with fixed size bins (bins=50)
Value, Count, Frequency (%)
1, 1, 0.1%
1076, 1, 0.1%
1086, 1, 0.1%
1085, 1, 0.1%
1084, 1, 0.1%
1083, 1, 0.1%
1082, 1, 0.1%
1081, 1, 0.1%
1080, 1, 0.1%
1079, 1, 0.1%
Other values (1607), 1607, 99.4%

Value, Count, Frequency (%)
1, 1, 0.1%
2, 1, 0.1%
3, 1, 0.1%
4, 1, 0.1%
5, 1, 0.1%
6, 1, 0.1%
7, 1, 0.1%
8, 1, 0.1%
9, 1, 0.1%
10, 1, 0.1%

Value, Count, Frequency (%)
1617, 1, 0.1%
1616, 1, 0.1%
1615, 1, 0.1%
1614, 1, 0.1%
1613, 1, 0.1%
1612, 1, 0.1%
1611, 1, 0.1%
1610, 1, 0.1%
1609, 1, 0.1%
1608, 1, 0.1%

기준년
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB
2020: 1617

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020
2nd row: 2020
3rd row: 2020
4th row: 2020
5th row: 2020

Common Values

Value, Count, Frequency (%)
2020, 1617, 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value, Count, Frequency (%)
2020, 1617, 100.0%

안내
Text

Distinct: 1615
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB

Length

Max length: 32
Median length: 32
Mean length: 32
Min length: 32

Characters and Unicode

Total characters: 51744
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1613
Unique (%): 99.8%

Sample

1st row: C1B68F97595572C3E05367020202C5BA
2nd row: C1B47266239D6E5EE0536702020296B4
3rd row: C2F6769E18976C86E05367020202F83E
4th row: C201255DDC042AE3E053670202025E17
5th row: C17AD63328A364AAE053670202022CBD

Value, Count, Frequency (%)
c1f33acff4ec2ad6e05367020202257c, 2, 0.1%
c2d22c2ae23d6578e05367020202e900, 2, 0.1%
c30aac84774c75bce05367020202e308, 1, 0.1%
c1b40571d7de6e5ce05367020202d0ba, 1, 0.1%
c1f4373bbbf21ad8e05367020202531e, 1, 0.1%
c1f33acff4f02ad6e05367020202257c, 1, 0.1%
c1b5348b6d52633fe053670202020dc3, 1, 0.1%
c1f43e8adbab1ad6e053670202020e83, 1, 0.1%
c1b7366210ae39a4e053670202029084, 1, 0.1%
c2bab5407f927a15e053670202024fbd, 1, 0.1%
Other values (1605), 1605, 99.3%
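
Every 안내 value is a fixed-length, 32-character hexadecimal string, and two values occur twice, which accounts for 1615 distinct but only 1613 unique values. The sketch below shows one way such identifiers could be validated and checked for duplicates. The helper `summarize_ids` is hypothetical, and comparing IDs case-insensitively is an assumption suggested by the lowercased frequency table.

```python
import re
from collections import Counter

HEX32 = re.compile(r"[0-9A-Fa-f]{32}")  # exactly 32 hexadecimal characters

def summarize_ids(ids):
    """Count distinct IDs (case-insensitive) and list malformed or repeated ones."""
    counts = Counter(value.lower() for value in ids)
    return {
        "total": len(ids),
        "distinct": len(counts),
        "invalid": [value for value in ids if not HEX32.fullmatch(value)],
        "duplicated": {key: n for key, n in counts.items() if n > 1},
    }

# The five sample rows shown in the report.
sample_ids = [
    "C1B68F97595572C3E05367020202C5BA",
    "C1B47266239D6E5EE0536702020296B4",
    "C2F6769E18976C86E05367020202F83E",
    "C201255DDC042AE3E053670202025E17",
    "C17AD63328A364AAE053670202022CBD",
]
summary = summarize_ids(sample_ids)
expected_total_characters = 1617 * 32  # rows x fixed length

print(summary["distinct"], summary["invalid"], expected_total_characters)
```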

Most occurring characters

Value, Count, Frequency (%)
0, 8510, 16.4%
2, 7552, 14.6%
3, 4111, 7.9%
7, 3663, 7.1%
5, 3577, 6.9%
6, 3439, 6.6%
C, 3279, 6.3%
E, 3169, 6.1%
1, 2309, 4.5%
4, 2162, 4.2%
Other values (6), 9973, 19.3%

Most occurring categories

Value, Count, Frequency (%)
Decimal Number, 38506, 74.4%
Uppercase Letter, 13238, 25.6%

Most frequent character per category

Decimal Number
Value, Count, Frequency (%)
0, 8510, 22.1%
2, 7552, 19.6%
3, 4111, 10.7%
7, 3663, 9.5%
5, 3577, 9.3%
6, 3439, 8.9%
1, 2309, 6.0%
4, 2162, 5.6%
8, 1655, 4.3%
9, 1528, 4.0%
Uppercase Letter
Value, Count, Frequency (%)
C, 3279, 24.8%
E, 3169, 23.9%
A, 1788, 13.5%
F, 1774, 13.4%
D, 1616, 12.2%
B, 1612, 12.2%

Most occurring scripts

Value, Count, Frequency (%)
Common, 38506, 74.4%
Latin, 13238, 25.6%

Most frequent character per script

Common
Value, Count, Frequency (%)
0, 8510, 22.1%
2, 7552, 19.6%
3, 4111, 10.7%
7, 3663, 9.5%
5, 3577, 9.3%
6, 3439, 8.9%
1, 2309, 6.0%
4, 2162, 5.6%
8, 1655, 4.3%
9, 1528, 4.0%
Latin
Value, Count, Frequency (%)
C, 3279, 24.8%
E, 3169, 23.9%
A, 1788, 13.5%
F, 1774, 13.4%
D, 1616, 12.2%
B, 1612, 12.2%

Most occurring blocks

Value, Count, Frequency (%)
ASCII, 51744, 100.0%

Most frequent character per block

ASCII
Value, Count, Frequency (%)
0, 8510, 16.4%
2, 7552, 14.6%
3, 4111, 7.9%
7, 3663, 7.1%
5, 3577, 6.9%
6, 3439, 6.6%
C, 3279, 6.3%
E, 3169, 6.1%
1, 2309, 4.5%
4, 2162, 4.2%
Other values (6), 9973, 19.3%

관할기관명
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB
한강유역환경청: 491
대구지방환경청: 307
낙동강유역환경청: 277
금강유역환경청: 236
영산강유역환경청: 127
Other values (2): 179

Length

Max length: 8
Median length: 7
Mean length: 7.2498454
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 금강유역환경청
2nd row: 금강유역환경청
3rd row: 금강유역환경청
4th row: 금강유역환경청
5th row: 금강유역환경청

Common Values

Value, Count, Frequency (%)
한강유역환경청, 491, 30.4%
대구지방환경청, 307, 19.0%
낙동강유역환경청, 277, 17.1%
금강유역환경청, 236, 14.6%
영산강유역환경청, 127, 7.9%
전북지방환경청, 120, 7.4%
원주지방환경청, 59, 3.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value, Count, Frequency (%)
한강유역환경청, 491, 30.4%
대구지방환경청, 307, 19.0%
낙동강유역환경청, 277, 17.1%
금강유역환경청, 236, 14.6%
영산강유역환경청, 127, 7.9%
전북지방환경청, 120, 7.4%
원주지방환경청, 59, 3.6%

관할기관코드
Real number (ℝ)

HIGH CORRELATION 

Distinct: 7
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6214.1373
Minimum: 6100
Maximum: 6410
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.3 KiB

Quantile statistics

Minimum: 6100
5-th percentile: 6100
Q1: 6100
median: 6200
Q3: 6300
95-th percentile: 6410
Maximum: 6410
Range: 310
Interquartile range (IQR): 200

Descriptive statistics

Standard deviation: 105.03247
Coefficient of variation (CV): 0.016902181
Kurtosis: -0.80459796
Mean: 6214.1373
Median Absolute Deviation (MAD): 100
Skewness: 0.57158904
Sum: 10048260
Variance: 11031.82
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Value, Count, Frequency (%)
6100, 491, 30.4%
6210, 307, 19.0%
6200, 277, 17.1%
6300, 236, 14.6%
6400, 127, 7.9%
6410, 120, 7.4%
6110, 59, 3.6%

Value, Count, Frequency (%)
6100, 491, 30.4%
6110, 59, 3.6%
6200, 277, 17.1%
6210, 307, 19.0%
6300, 236, 14.6%
6400, 127, 7.9%
6410, 120, 7.4%

Value, Count, Frequency (%)
6410, 120, 7.4%
6400, 127, 7.9%
6300, 236, 14.6%
6210, 307, 19.0%
6200, 277, 17.1%
6110, 59, 3.6%
6100, 491, 30.4%
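
The reported Sum and Mean for 관할기관코드 can be recovered from the frequency table alone. The sketch below does so with a plain dictionary; the variable names are illustrative. Because the column is an office code that the profiler typed as a real number, these moments describe the code values, not a measured quantity.

```python
# Code -> number of rows, copied from the frequency table.
code_counts = {
    6100: 491, 6110: 59, 6200: 277, 6210: 307,
    6300: 236, 6400: 127, 6410: 120,
}

n_rows = sum(code_counts.values())
code_sum = sum(code * count for code, count in code_counts.items())
code_mean = code_sum / n_rows  # frequency-weighted mean

print(n_rows, code_sum, round(code_mean, 4))
```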

사업장명
Text

UNIQUE 

Distinct: 1617
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB

Length

Max length: 33
Median length: 25
Mean length: 9.7254174
Min length: 2

Characters and Unicode

Total characters: 15726
Distinct characters: 518
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1617
Unique (%): 100.0%

Sample

1st row: (주)LG화학 대산공장
2nd row: (주)LG화학 오창1공장
3rd row: (주)경보제약
4th row: (주)그린이에스
5th row: (주)기산화학
ValueCountFrequency (%)
주식회사 44
 
2.1%
한국수자원공사 28
 
1.3%
한국지역난방공사 11
 
0.5%
울산공장 11
 
0.5%
9
 
0.4%
상수도사업본부 7
 
0.3%
제2공장 7
 
0.3%
2공장 6
 
0.3%
제1공장 5
 
0.2%
온산공장 5
 
0.2%
Other values (1810) 1965
93.7%

Most occurring characters

ValueCountFrequency (%)
998
 
6.3%
( 853
 
5.4%
) 853
 
5.4%
545
 
3.5%
505
 
3.2%
501
 
3.2%
402
 
2.6%
306
 
1.9%
305
 
1.9%
299
 
1.9%
Other values (508) 10159
64.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 12599
80.1%
Open Punctuation 858
 
5.5%
Close Punctuation 858
 
5.5%
Space Separator 505
 
3.2%
Other Symbol 402
 
2.6%
Uppercase Letter 249
 
1.6%
Decimal Number 156
 
1.0%
Connector Punctuation 39
 
0.2%
Lowercase Letter 28
 
0.2%
Other Punctuation 20
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
998
 
7.9%
545
 
4.3%
501
 
4.0%
306
 
2.4%
305
 
2.4%
299
 
2.4%
285
 
2.3%
231
 
1.8%
229
 
1.8%
216
 
1.7%
Other values (447) 8684
68.9%
Uppercase Letter
ValueCountFrequency (%)
S 38
15.3%
C 34
13.7%
L 23
9.2%
K 23
9.2%
G 16
 
6.4%
M 15
 
6.0%
I 13
 
5.2%
P 11
 
4.4%
O 11
 
4.4%
D 10
 
4.0%
Other values (13) 55
22.1%
Lowercase Letter
ValueCountFrequency (%)
l 4
14.3%
a 3
10.7%
s 3
10.7%
p 2
 
7.1%
r 2
 
7.1%
e 2
 
7.1%
i 2
 
7.1%
o 2
 
7.1%
t 1
 
3.6%
h 1
 
3.6%
Other values (6) 6
21.4%
Decimal Number
ValueCountFrequency (%)
2 71
45.5%
1 45
28.8%
3 16
 
10.3%
4 10
 
6.4%
5 5
 
3.2%
7 4
 
2.6%
6 2
 
1.3%
0 1
 
0.6%
8 1
 
0.6%
9 1
 
0.6%
Other Punctuation
ValueCountFrequency (%)
. 11
55.0%
, 6
30.0%
/ 2
 
10.0%
& 1
 
5.0%
Open Punctuation
ValueCountFrequency (%)
( 853
99.4%
[ 5
 
0.6%
Close Punctuation
ValueCountFrequency (%)
) 853
99.4%
] 5
 
0.6%
Space Separator
ValueCountFrequency (%)
505
100.0%
Other Symbol
ValueCountFrequency (%)
402
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 39
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 13000
82.7%
Common 2448
 
15.6%
Latin 277
 
1.8%
Han 1
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
998
 
7.7%
545
 
4.2%
501
 
3.9%
402
 
3.1%
306
 
2.4%
305
 
2.3%
299
 
2.3%
285
 
2.2%
231
 
1.8%
229
 
1.8%
Other values (447) 8899
68.5%
Latin
ValueCountFrequency (%)
S 38
13.7%
C 34
12.3%
L 23
 
8.3%
K 23
 
8.3%
G 16
 
5.8%
M 15
 
5.4%
I 13
 
4.7%
P 11
 
4.0%
O 11
 
4.0%
D 10
 
3.6%
Other values (29) 83
30.0%
Common
ValueCountFrequency (%)
( 853
34.8%
) 853
34.8%
505
20.6%
2 71
 
2.9%
1 45
 
1.8%
_ 39
 
1.6%
3 16
 
0.7%
- 12
 
0.5%
. 11
 
0.4%
4 10
 
0.4%
Other values (11) 33
 
1.3%
Han
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 12598
80.1%
ASCII 2725
 
17.3%
None 402
 
2.6%
CJK 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
998
 
7.9%
545
 
4.3%
501
 
4.0%
306
 
2.4%
305
 
2.4%
299
 
2.4%
285
 
2.3%
231
 
1.8%
229
 
1.8%
216
 
1.7%
Other values (446) 8683
68.9%
ASCII
ValueCountFrequency (%)
( 853
31.3%
) 853
31.3%
505
18.5%
2 71
 
2.6%
1 45
 
1.7%
_ 39
 
1.4%
S 38
 
1.4%
C 34
 
1.2%
L 23
 
0.8%
K 23
 
0.8%
Other values (50) 241
 
8.8%
None
ValueCountFrequency (%)
402
100.0%
CJK
ValueCountFrequency (%)
1
100.0%

주소명
Text

Distinct: 1548
Distinct (%): 95.7%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB

Length

Max length: 32
Median length: 27
Mean length: 20.225108
Min length: 13

Characters and Unicode

Total characters: 32704
Distinct characters: 349
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1489
Unique (%): 92.1%

Sample

1st row: 충청남도 서산시 대산읍 독곶1로 54
2nd row: 충청북도 청주시 흥덕구 옥산면 과학산업3로 29
3rd row: 충청남도 아산시 실옥로 174
4th row: 충청남도 아산시 환경공원로 91
5th row: 충청남도 당진시 석문면 산단2로 95

Value, Count, Frequency (%)
경기도, 388, 5.2%
대구광역시, 182, 2.5%
서구, 157, 2.1%
경상북도, 125, 1.7%
충청북도, 124, 1.7%
전라북도, 120, 1.6%
안산시, 118, 1.6%
단원구, 115, 1.6%
울산광역시, 112, 1.5%
남구, 101, 1.4%
Other values (1992), 5868, 79.2%

Most occurring characters

ValueCountFrequency (%)
5805
 
17.8%
1549
 
4.7%
1397
 
4.3%
1 1145
 
3.5%
1104
 
3.4%
1065
 
3.3%
882
 
2.7%
2 833
 
2.5%
3 678
 
2.1%
632
 
1.9%
Other values (339) 17614
53.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 21011
64.2%
Space Separator 5805
 
17.8%
Decimal Number 5654
 
17.3%
Dash Punctuation 232
 
0.7%
Other Punctuation 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1549
 
7.4%
1397
 
6.6%
1104
 
5.3%
1065
 
5.1%
882
 
4.2%
632
 
3.0%
614
 
2.9%
593
 
2.8%
558
 
2.7%
519
 
2.5%
Other values (325) 12098
57.6%
Decimal Number
ValueCountFrequency (%)
1 1145
20.3%
2 833
14.7%
3 678
12.0%
4 500
8.8%
5 487
8.6%
7 475
8.4%
0 432
 
7.6%
6 419
 
7.4%
8 360
 
6.4%
9 325
 
5.7%
Other Punctuation
ValueCountFrequency (%)
· 1
50.0%
. 1
50.0%
Space Separator
ValueCountFrequency (%)
5805
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 232
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 21011
64.2%
Common 11693
35.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1549
 
7.4%
1397
 
6.6%
1104
 
5.3%
1065
 
5.1%
882
 
4.2%
632
 
3.0%
614
 
2.9%
593
 
2.8%
558
 
2.7%
519
 
2.5%
Other values (325) 12098
57.6%
Common
ValueCountFrequency (%)
5805
49.6%
1 1145
 
9.8%
2 833
 
7.1%
3 678
 
5.8%
4 500
 
4.3%
5 487
 
4.2%
7 475
 
4.1%
0 432
 
3.7%
6 419
 
3.6%
8 360
 
3.1%
Other values (4) 559
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
Hangul 21011
64.2%
ASCII 11692
35.8%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5805
49.6%
1 1145
 
9.8%
2 833
 
7.1%
3 678
 
5.8%
4 500
 
4.3%
5 487
 
4.2%
7 475
 
4.1%
0 432
 
3.7%
6 419
 
3.6%
8 360
 
3.1%
Other values (3) 558
 
4.8%
Hangul
ValueCountFrequency (%)
1549
 
7.4%
1397
 
6.6%
1104
 
5.3%
1065
 
5.1%
882
 
4.2%
632
 
3.0%
614
 
2.9%
593
 
2.8%
558
 
2.7%
519
 
2.5%
Other values (325) 12098
57.6%
None
ValueCountFrequency (%)
· 1
100.0%

개별처리 후 직접방류
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB
<NA>: 1340
O: 277

Length

Max length: 4
Median length: 4
Mean length: 3.4860853
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: O
2nd row: <NA>
3rd row: O
4th row: O
5th row: <NA>

Common Values

Value, Count, Frequency (%)
<NA>, 1340, 82.9%
O, 277, 17.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value, Count, Frequency (%)
na, 1340, 82.9%
o, 277, 17.1%

개별처리 후 공공하폐수처리시설로 유입처리
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB
O: 829
<NA>: 788

Length

Max length: 4
Median length: 1
Mean length: 2.4619666
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: O
2nd row: O
3rd row: <NA>
4th row: <NA>
5th row: O

Common Values

Value, Count, Frequency (%)
O, 829, 51.3%
<NA>, 788, 48.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value, Count, Frequency (%)
o, 829, 51.3%
na, 788, 48.7%

원폐수를 공공하폐수처리시설로 유입처리
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB
<NA>: 1498
O: 119

Length

Max length: 4
Median length: 4
Mean length: 3.7792208
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value, Count, Frequency (%)
<NA>, 1498, 92.6%
O, 119, 7.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value, Count, Frequency (%)
na, 1498, 92.6%
o, 119, 7.4%

원폐수를 공동방지시설로 유입처리
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB
<NA>: 1186
O: 431

Length

Max length: 4
Median length: 4
Mean length: 3.2003711
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value, Count, Frequency (%)
<NA>, 1186, 73.3%
O, 431, 26.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value, Count, Frequency (%)
na, 1186, 73.3%
o, 431, 26.7%

위탁처리
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB
<NA>: 1371
O: 246

Length

Max length: 4
Median length: 4
Mean length: 3.5435993
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: O
2nd row: O
3rd row: O
4th row: <NA>
5th row: <NA>

Common Values

Value, Count, Frequency (%)
<NA>, 1371, 84.8%
O, 246, 15.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value, Count, Frequency (%)
na, 1371, 84.8%
o, 246, 15.2%

기타처리
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB
<NA>: 1530
O: 87

Length

Max length: 4
Median length: 4
Mean length: 3.83859
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: O
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value, Count, Frequency (%)
<NA>, 1530, 94.6%
O, 87, 5.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value, Count, Frequency (%)
na, 1530, 94.6%
o, 87, 5.4%

폐수처리장코드
Categorical

HIGH CORRELATION 

Distinct: 34
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB
02|: 632
04|: 368
01|: 191
02|05|: 112
03|: 95
Other values (29): 219

Length

Max length: 12
Median length: 3
Mean length: 3.690167
Min length: 3

Unique

Unique: 6
Unique (%): 0.4%

Sample

1st row: 01|02|05|
2nd row: 02|05|06|
3rd row: 01|05|
4th row: 01|
5th row: 02|

Common Values

Value, Count, Frequency (%)
02|, 632, 39.1%
04|, 368, 22.8%
01|, 191, 11.8%
02|05|, 112, 6.9%
03|, 95, 5.9%
01|05|, 47, 2.9%
04|05|, 28, 1.7%
02|06|, 26, 1.6%
02|05|06|, 14, 0.9%
04|06|, 11, 0.7%
Other values (24), 93, 5.8%

Length

Histogram of lengths of the category

Value, Count, Frequency (%)
02, 632, 39.1%
04, 368, 22.8%
01, 191, 11.8%
02|05, 112, 6.9%
03, 95, 5.9%
01|05, 47, 2.9%
04|05, 28, 1.7%
02|06, 26, 1.6%
02|05|06, 14, 0.9%
04|06, 11, 0.7%
Other values (24), 93, 5.8%
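
폐수처리장코드 packs several two-digit codes into one pipe-delimited string with a trailing pipe (for example `01|02|05|`), which is why the profiler sees 34 distinct combinations. The sketch below splits such a value into individual codes. The code-to-column mapping in `TREATMENT_BY_CODE` is an assumption inferred from the sample rows, where each code lines up with the "O" marks in the six treatment columns; it is not documented in the report. Both function names are hypothetical.

```python
# Assumed mapping, inferred from the sample rows (not documented in the report).
TREATMENT_BY_CODE = {
    "01": "개별처리 후 직접방류",
    "02": "개별처리 후 공공하폐수처리시설로 유입처리",
    "03": "원폐수를 공공하폐수처리시설로 유입처리",
    "04": "원폐수를 공동방지시설로 유입처리",
    "05": "위탁처리",
    "06": "기타처리",
}

def parse_treatment_codes(raw):
    """Split '01|02|05|' into ['01', '02', '05'], dropping the empty tail."""
    return [code for code in raw.split("|") if code]

def treatment_flags(raw):
    """Return one boolean per treatment column for a packed code string."""
    codes = set(parse_treatment_codes(raw))
    return {name: code in codes for code, name in TREATMENT_BY_CODE.items()}

first_row = parse_treatment_codes("01|02|05|")
flags = treatment_flags("03|05|06|")
print(first_row, [name for name, on in flags.items() if on])
```

Exploding the column this way would let the combinations be analysed per treatment type instead of as 34 opaque categories.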

공개여부
Boolean

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB
True: 1617
Value, Count, Frequency (%)
True, 1617, 100.0%

Interactions


Correlations

,순번,관할기관명,관할기관코드,폐수처리장코드
순번,1.000,0.936,0.838,0.445
관할기관명,0.936,1.000,1.000,0.509
관할기관코드,0.838,1.000,1.000,0.498
폐수처리장코드,0.445,0.509,0.498,1.000

,기타처리,원폐수를 공동방지시설로 유입처리,관할기관명,개별처리 후 직접방류,폐수처리장코드,위탁처리,개별처리 후 공공하폐수처리시설로 유입처리,원폐수를 공공하폐수처리시설로 유입처리
기타처리,1.000,1.000,1.000,1.000,1.000,1.000,1.000,1.000
원폐수를 공동방지시설로 유입처리,1.000,1.000,1.000,1.000,1.000,1.000,1.000,NaN
관할기관명,1.000,1.000,1.000,1.000,0.233,1.000,1.000,1.000
개별처리 후 직접방류,1.000,1.000,1.000,1.000,1.000,1.000,1.000,1.000
폐수처리장코드,1.000,1.000,0.233,1.000,1.000,1.000,1.000,1.000
위탁처리,1.000,1.000,1.000,1.000,1.000,1.000,1.000,1.000
개별처리 후 공공하폐수처리시설로 유입처리,1.000,1.000,1.000,1.000,1.000,1.000,1.000,1.000
원폐수를 공공하폐수처리시설로 유입처리,1.000,NaN,1.000,1.000,1.000,1.000,1.000,1.000

,순번,관할기관코드,관할기관명,개별처리 후 직접방류,개별처리 후 공공하폐수처리시설로 유입처리,원폐수를 공공하폐수처리시설로 유입처리,원폐수를 공동방지시설로 유입처리,위탁처리,기타처리,폐수처리장코드
순번,1.000,-0.565,0.835,1.000,1.000,1.000,1.000,1.000,1.000,0.171
관할기관코드,-0.565,1.000,0.999,1.000,1.000,1.000,1.000,1.000,1.000,0.265
관할기관명,0.835,0.999,1.000,1.000,1.000,1.000,1.000,1.000,1.000,0.233
개별처리 후 직접방류,1.000,1.000,1.000,1.000,1.000,1.000,1.000,1.000,1.000,1.000
개별처리 후 공공하폐수처리시설로 유입처리,1.000,1.000,1.000,1.000,1.000,1.000,1.000,1.000,1.000,1.000
원폐수를 공공하폐수처리시설로 유입처리,1.000,1.000,1.000,1.000,1.000,1.000,0.000,1.000,1.000,1.000
원폐수를 공동방지시설로 유입처리,1.000,1.000,1.000,1.000,1.000,0.000,1.000,1.000,1.000,1.000
위탁처리,1.000,1.000,1.000,1.000,1.000,1.000,1.000,1.000,1.000,1.000
기타처리,1.000,1.000,1.000,1.000,1.000,1.000,1.000,1.000,1.000,1.000
폐수처리장코드,0.171,0.265,0.233,1.000,1.000,1.000,1.000,1.000,1.000,1.000
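
A coefficient of 1.000 between 관할기관명 and 관할기관코드 is what a one-to-one mapping between office name and office code would produce. The sketch below computes the plain (uncorrected) Cramér's V from a list of value pairs; it is not necessarily the estimator the report uses. The name-to-code pairing is an assumption inferred from the identical counts in the two frequency tables; the sample rows confirm only 금강유역환경청 / 6300 and 한강유역환경청 / 6100.

```python
import math
from collections import Counter

def cramers_v(pairs):
    """Uncorrected Cramér's V for a list of (a, b) categorical pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    row_totals = Counter(a for a, _ in pairs)
    col_totals = Counter(b for _, b in pairs)
    chi2 = 0.0
    for a, row_total in row_totals.items():
        for b, col_total in col_totals.items():
            expected = row_total * col_total / n
            chi2 += (joint.get((a, b), 0) - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals))
    if k < 2:
        return 0.0  # association is undefined with a single category
    return math.sqrt(chi2 / (n * (k - 1)))

# Assumed pairing (inferred from matching counts) with the reported row counts.
office_code_counts = {
    ("한강유역환경청", 6100): 491,
    ("대구지방환경청", 6210): 307,
    ("낙동강유역환경청", 6200): 277,
    ("금강유역환경청", 6300): 236,
    ("영산강유역환경청", 6400): 127,
    ("전북지방환경청", 6410): 120,
    ("원주지방환경청", 6110): 59,
}
pairs = [pair for pair, count in office_code_counts.items() for _ in range(count)]
v = cramers_v(pairs)
print(round(v, 3))
```

Since the code carries no information beyond the name, one of the two columns could be dropped before modelling.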

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
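
The report counts zero missing cells, yet the six treatment columns are dominated by the value `<NA>`. The reported mean lengths suggest these cells were profiled as the four-character text `<NA>`, not as nulls. The sketch below checks this for 개별처리 후 직접방류 using the counts from its frequency table; the helper `mean_length` and the constant `NA_PLACEHOLDER` are illustrative.

```python
NA_PLACEHOLDER = "<NA>"  # literal text found in the treatment columns

def mean_length(value_counts):
    """Average string length over all rows, given value -> row count."""
    total_rows = sum(value_counts.values())
    total_chars = sum(len(value) * count for value, count in value_counts.items())
    return total_chars / total_rows

# Counts copied from the frequency table of 개별처리 후 직접방류.
direct_discharge = {"<NA>": 1340, "O": 277}

avg_len = mean_length(direct_discharge)
placeholder_share = direct_discharge[NA_PLACEHOLDER] / sum(direct_discharge.values())

print(round(avg_len, 7), f"{placeholder_share:.1%}")
```

If the placeholder really denotes an absent mark, converting it to a proper null (or the column to a boolean) before profiling would make the missing-value plots meaningful.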

Sample

First rows

,순번,기준년,안내,관할기관명,관할기관코드,사업장명,주소명,개별처리 후 직접방류,개별처리 후 공공하폐수처리시설로 유입처리,원폐수를 공공하폐수처리시설로 유입처리,원폐수를 공동방지시설로 유입처리,위탁처리,기타처리,폐수처리장코드,공개여부
0,1,2020,C1B68F97595572C3E05367020202C5BA,금강유역환경청,6300,(주)LG화학 대산공장,충청남도 서산시 대산읍 독곶1로 54,O,O,<NA>,<NA>,O,<NA>,01|02|05|,Y
1,2,2020,C1B47266239D6E5EE0536702020296B4,금강유역환경청,6300,(주)LG화학 오창1공장,충청북도 청주시 흥덕구 옥산면 과학산업3로 29,<NA>,O,<NA>,<NA>,O,O,02|05|06|,Y
2,3,2020,C2F6769E18976C86E05367020202F83E,금강유역환경청,6300,(주)경보제약,충청남도 아산시 실옥로 174,O,<NA>,<NA>,<NA>,O,<NA>,01|05|,Y
3,4,2020,C201255DDC042AE3E053670202025E17,금강유역환경청,6300,(주)그린이에스,충청남도 아산시 환경공원로 91,O,<NA>,<NA>,<NA>,<NA>,<NA>,01|,Y
4,5,2020,C17AD63328A364AAE053670202022CBD,금강유역환경청,6300,(주)기산화학,충청남도 당진시 석문면 산단2로 95,<NA>,O,<NA>,<NA>,<NA>,<NA>,02|,Y
5,6,2020,C1F5251392FF1AD4E053670202021938,금강유역환경청,6300,(주)네오텍,충청북도 청주시 흥덕구 2순환로742번길 51,<NA>,O,<NA>,<NA>,O,O,02|05|06|,Y
6,7,2020,C21DAA6019CE55F6E05367020202D26A,금강유역환경청,6300,(주)네패스 오창2공장,충청북도 청주시 청원구 오창읍 과학산업2로 587-32,<NA>,<NA>,O,<NA>,O,O,03|05|06|,Y
7,8,2020,C19F3D5078D24407E05367020202F212,금강유역환경청,6300,(주)녹십자 오창공장,충청북도 청주시 청원구 오창읍 과학산업2로 586,<NA>,O,<NA>,<NA>,<NA>,<NA>,02|,Y
8,9,2020,C321E96AFE6B5384E05367020202BD04,금강유역환경청,6300,(주)농심미분,충청남도 아산시 탕정면 탕정면로 485,<NA>,<NA>,<NA>,O,<NA>,<NA>,04|,Y
9,10,2020,C27F884FD35B401AE05367020202F75A,금강유역환경청,6300,(주)농협홍삼,충청북도 증평군 증평읍 중앙로 88,<NA>,O,<NA>,<NA>,O,<NA>,02|05|,Y

Last rows

,순번,기준년,안내,관할기관명,관할기관코드,사업장명,주소명,개별처리 후 직접방류,개별처리 후 공공하폐수처리시설로 유입처리,원폐수를 공공하폐수처리시설로 유입처리,원폐수를 공동방지시설로 유입처리,위탁처리,기타처리,폐수처리장코드,공개여부
1607,1608,2020,C2589B3DA7705D3CE0536702020215EC,한강유역환경청,6100,한미헬스케어(주),경기도 평택시 산단로 106,<NA>,O,<NA>,<NA>,<NA>,<NA>,02|,Y
1608,1609,2020,C27F3762C4BB0305E053670202025ECB,한강유역환경청,6100,한양대학교,서울특별시 성동구 왕십리로 222,<NA>,O,<NA>,<NA>,<NA>,<NA>,02|,Y
1609,1610,2020,C344280DBC7350A5E0536702020223DF,한강유역환경청,6100,한온시스템,경기도 평택시 포승읍 하만호길 32-1,O,<NA>,<NA>,<NA>,<NA>,<NA>,01|,Y
1610,1611,2020,C23ED51D23D14653E05367020202ACD8,한강유역환경청,6100,핸즈코퍼레이션㈜2공장,인천광역시 서구 가정로37번길 50,O,<NA>,<NA>,<NA>,<NA>,<NA>,01|,Y
1611,1612,2020,C3A1AE99BB275918E053670202020A05,한강유역환경청,6100,현대상사,경기도 동두천시 강변로730번길 43-13,<NA>,<NA>,<NA>,O,<NA>,<NA>,04|,Y
1612,1613,2020,C1F13C1AC28C68B0E0536702020220FD,한강유역환경청,6100,현대자동차(주)남양연구소,경기도 화성시 남양읍 현대연구소로 150,O,<NA>,<NA>,<NA>,<NA>,<NA>,01|,Y
1613,1614,2020,C3590C46462027CAE0536702020256E8,한강유역환경청,6100,현대제철(주),인천광역시 동구 중봉대로 63,<NA>,O,<NA>,<NA>,O,<NA>,02|05|,Y
1614,1615,2020,C168433B6C165AD7E053670202020CDE,한강유역환경청,6100,현우산업(주),인천광역시 서구 오류동 검단로 51,<NA>,O,<NA>,<NA>,O,<NA>,02|05|,Y
1615,1616,2020,C2CDA38CB91F67D2E05367020202B0B2,한강유역환경청,6100,현진전자(주),경기도 안산시 단원구 산단로 231,<NA>,O,<NA>,<NA>,<NA>,<NA>,02|,Y
1616,1617,2020,C36FAEBF23E31AFFE05367020202B780,한강유역환경청,6100,흥일염공㈜,경기도 시흥시 협력로 22,<NA>,<NA>,<NA>,O,<NA>,<NA>,04|,Y