Overview

Dataset statistics

Number of variables: 13
Number of observations: 3902
Missing cells: 13
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 415.5 KiB
Average record size in memory: 109.0 B
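
The derived figures in this block follow from the raw counts by simple arithmetic. The sketch below recomputes them from the numbers reported above; it assumes that the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB = 1024 B, which is consistent with the reported values but is not stated in the report itself.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 13
n_observations = 3902
missing_cells = 13
total_size_kib = 415.5

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.4f}%")   # well below the "< 0.1%" threshold
print(f"average record size: {avg_record_bytes:.1f} B")
```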

Variable types

Text: 4
Categorical: 5
Numeric: 4

Dataset

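Description: Detailed waste generation data by business (on a representative-site basis) from the Environmental Information Disclosure System, as of July 2023, covering generation of general waste, designated waste, construction waste, and related figures.
URL: https://www.data.go.kr/data/15103093/fileData.do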

Alerts

년도 has constant value "2021" (Constant)
일반폐기물 is highly overall correlated with 폐기물발생량총량 (High correlation)
지정폐기물 is highly overall correlated with 폐기물발생량총량 (High correlation)
폐기물발생량총량 is highly overall correlated with 일반폐기물 and 1 other field (High correlation)
사업장구분 is highly overall correlated with 특성 (High correlation)
유형 is highly overall correlated with 특성 and 1 other field (High correlation)
특성 is highly overall correlated with 사업장구분 and 1 other field (High correlation)
업종 is highly overall correlated with 유형 (High correlation)
일반폐기물 is highly skewed (γ1 = 34.85329003) (Skewed)
지정폐기물 is highly skewed (γ1 = 55.15897999) (Skewed)
건설폐기물 is highly skewed (γ1 = 25.49967008) (Skewed)
폐기물발생량총량 is highly skewed (γ1 = 33.96077494) (Skewed)
일반폐기물 has 103 (2.6%) zeros (Zeros)
지정폐기물 has 1922 (49.3%) zeros (Zeros)
건설폐기물 has 3508 (89.9%) zeros (Zeros)
폐기물발생량총량 has 74 (1.9%) zeros (Zeros)
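
The Skewed and Zeros alerts can be illustrated with a small sketch. The zero shares are recomputed from the counts in the alerts and the 3902 observations. The skewness function is the plain moment-based γ1 (third central moment over the 1.5 power of the second); the profiler may apply a bias correction, so this is an assumption about the definition, not its actual implementation. The `toy` list is hypothetical data, not taken from the dataset.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def zero_share(n_zeros, n_rows):
    """Percentage of rows whose value is exactly zero."""
    return 100 * n_zeros / n_rows

# Zero counts copied from the alerts; 3902 is the number of observations.
zero_counts = {"일반폐기물": 103, "지정폐기물": 1922, "건설폐기물": 3508, "폐기물발생량총량": 74}
shares = {name: round(zero_share(count, 3902), 1) for name, count in zero_counts.items()}
print(shares)

# Hypothetical toy column: mostly zeros plus one large value gives a long right tail.
toy = [0.0] * 9 + [100.0]
print(skewness(toy))
```

A column dominated by zeros with a few very large values, as the waste columns here are, produces a large positive γ1 for the same reason the toy list does.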

Reproduction

Analysis started: 2023-12-12 11:27:14.411787
Analysis finished: 2023-12-12 11:27:20.666222
Duration: 6.25 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

대표사업장코드 (representative site code)
Text

Distinct: 1824
Distinct (%): 46.7%
Missing: 0
Missing (%): 0.0%
Memory size: 30.6 KiB

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Characters and Unicode

Total characters: 78040
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
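
As an illustration of these character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The two strings are values that appear in this report's sample rows; the helper name `category_counts` is hypothetical. The two-letter codes map to the category names used in the report: `Nd` is Decimal Number, `Lu` is Uppercase Letter, `Lo` is Other Letter, `Ps` and `Pe` are Open and Close Punctuation, and `Zs` is Space Separator.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count Unicode general categories (e.g. 'Nd', 'Lu', 'Lo') in a string."""
    return Counter(unicodedata.category(ch) for ch in text)

code = "CT000000000000054193"        # a 대표사업장코드 value from the sample rows
name = "한국지엠(주) 부평공장 (본사)"     # a 대표사업장명 value from the sample rows

print(category_counts(code))   # digits plus two uppercase letters
print(category_counts(name))   # Hangul syllables, parentheses and spaces
```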

Unique

Unique: 1308
Unique (%): 33.5%

Sample

1st row: 00000000000000095296
2nd row: 00000000000000095296
3rd row: 00000000000000095296
4th row: 00000000000000095297
5th row: 00000000000000095300

Value | Count | Frequency (%)
00000000000000148908 | 94 | 2.4%
00000000000000148293 | 76 | 1.9%
00000000000000162852 | 61 | 1.6%
00000000000000149636 | 57 | 1.5%
00000000000000259804 | 31 | 0.8%
00000000000000357473 | 26 | 0.7%
00000000000000171242 | 23 | 0.6%
00000000000000195126 | 22 | 0.6%
00000000000000148676 | 20 | 0.5%
00000000000000145328 | 19 | 0.5%
Other values (1814) | 3473 | 89.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 55817 | 71.5%
1 | 3490 | 4.5%
2 | 2654 | 3.4%
4 | 2431 | 3.1%
8 | 2354 | 3.0%
9 | 2025 | 2.6%
5 | 2005 | 2.6%
6 | 1845 | 2.4%
3 | 1674 | 2.1%
7 | 1583 | 2.0%
Other values (3) | 2162 | 2.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 75878 | 97.2%
Uppercase Letter | 2162 | 2.8%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 55817 | 73.6%
1 | 3490 | 4.6%
2 | 2654 | 3.5%
4 | 2431 | 3.2%
8 | 2354 | 3.1%
9 | 2025 | 2.7%
5 | 2005 | 2.6%
6 | 1845 | 2.4%
3 | 1674 | 2.2%
7 | 1583 | 2.1%

Uppercase Letter
Value | Count | Frequency (%)
C | 1081 | 50.0%
T | 922 | 42.6%
M | 159 | 7.4%

Most occurring scripts

Value | Count | Frequency (%)
Common | 75878 | 97.2%
Latin | 2162 | 2.8%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 55817 | 73.6%
1 | 3490 | 4.6%
2 | 2654 | 3.5%
4 | 2431 | 3.2%
8 | 2354 | 3.1%
9 | 2025 | 2.7%
5 | 2005 | 2.6%
6 | 1845 | 2.4%
3 | 1674 | 2.2%
7 | 1583 | 2.1%

Latin
Value | Count | Frequency (%)
C | 1081 | 50.0%
T | 922 | 42.6%
M | 159 | 7.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 78040 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 55817 | 71.5%
1 | 3490 | 4.5%
2 | 2654 | 3.4%
4 | 2431 | 3.1%
8 | 2354 | 3.0%
9 | 2025 | 2.6%
5 | 2005 | 2.6%
6 | 1845 | 2.4%
3 | 1674 | 2.1%
7 | 1583 | 2.0%
Other values (3) | 2162 | 2.8%

대표사업장명 (representative site name)
Text

Distinct: 1824
Distinct (%): 46.7%
Missing: 0
Missing (%): 0.0%
Memory size: 30.6 KiB

Length

Max length: 29
Median length: 22
Mean length: 8.8385443
Min length: 2

Characters and Unicode

Total characters: 34488
Distinct characters: 526
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1308
Unique (%): 33.5%

Sample

1st row: 한국지엠(주) 부평공장 (본사)
2nd row: 한국지엠(주) 부평공장 (본사)
3rd row: 한국지엠(주) 부평공장 (본사)
4th row: ROHMKOREA 대전공장
5th row: 롯데정밀화학(주)

Value | Count | Frequency (%)
본사 | 271 | 5.2%
주식회사 | 153 | 3.0%
주)이마트(본사+성수점 | 94 | 1.8%
본점(본사 | 76 | 1.5%
롯데쇼핑(주 | 76 | 1.5%
한국전력공사 | 61 | 1.2%
홈플러스(주 | 57 | 1.1%
한국수자원공사 | 31 | 0.6%
본부 | 28 | 0.5%
대표사업장 | 28 | 0.5%
Other values (2003) | 4310 | 83.1%

Most occurring characters

ValueCountFrequency (%)
2085
 
6.0%
) 1987
 
5.8%
( 1978
 
5.7%
1334
 
3.9%
1298
 
3.8%
847
 
2.5%
813
 
2.4%
759
 
2.2%
675
 
2.0%
647
 
1.9%
Other values (516) 22065
64.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 28116
81.5%
Close Punctuation 1987
 
5.8%
Open Punctuation 1978
 
5.7%
Space Separator 1298
 
3.8%
Uppercase Letter 718
 
2.1%
Lowercase Letter 101
 
0.3%
Math Symbol 94
 
0.3%
Decimal Number 90
 
0.3%
Connector Punctuation 51
 
0.1%
Other Punctuation 41
 
0.1%
Other values (2) 14
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2085
 
7.4%
1334
 
4.7%
847
 
3.0%
813
 
2.9%
759
 
2.7%
675
 
2.4%
647
 
2.3%
534
 
1.9%
522
 
1.9%
486
 
1.7%
Other values (457) 19414
69.0%
Uppercase Letter
ValueCountFrequency (%)
S 137
19.1%
C 100
13.9%
K 86
12.0%
G 57
7.9%
L 56
7.8%
D 44
 
6.1%
J 36
 
5.0%
E 32
 
4.5%
I 29
 
4.0%
T 23
 
3.2%
Other values (14) 118
16.4%
Lowercase Letter
ValueCountFrequency (%)
t 23
22.8%
k 16
15.8%
s 13
12.9%
a 9
 
8.9%
p 8
 
7.9%
e 7
 
6.9%
u 6
 
5.9%
m 6
 
5.9%
b 3
 
3.0%
l 2
 
2.0%
Other values (7) 8
 
7.9%
Decimal Number
ValueCountFrequency (%)
1 60
66.7%
8 8
 
8.9%
2 7
 
7.8%
5 6
 
6.7%
3 4
 
4.4%
4 3
 
3.3%
7 2
 
2.2%
Other Punctuation
ValueCountFrequency (%)
& 31
75.6%
, 6
 
14.6%
. 3
 
7.3%
/ 1
 
2.4%
Close Punctuation
ValueCountFrequency (%)
) 1987
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1978
100.0%
Space Separator
ValueCountFrequency (%)
1298
100.0%
Math Symbol
ValueCountFrequency (%)
+ 94
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 51
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 12
100.0%
Other Symbol
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 28118
81.5%
Common 5551
 
16.1%
Latin 819
 
2.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2085
 
7.4%
1334
 
4.7%
847
 
3.0%
813
 
2.9%
759
 
2.7%
675
 
2.4%
647
 
2.3%
534
 
1.9%
522
 
1.9%
486
 
1.7%
Other values (458) 19416
69.1%
Latin
ValueCountFrequency (%)
S 137
16.7%
C 100
12.2%
K 86
 
10.5%
G 57
 
7.0%
L 56
 
6.8%
D 44
 
5.4%
J 36
 
4.4%
E 32
 
3.9%
I 29
 
3.5%
T 23
 
2.8%
Other values (31) 219
26.7%
Common
ValueCountFrequency (%)
) 1987
35.8%
( 1978
35.6%
1298
23.4%
+ 94
 
1.7%
1 60
 
1.1%
_ 51
 
0.9%
& 31
 
0.6%
- 12
 
0.2%
8 8
 
0.1%
2 7
 
0.1%
Other values (7) 25
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 28116
81.5%
ASCII 6370
 
18.5%
None 2
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2085
 
7.4%
1334
 
4.7%
847
 
3.0%
813
 
2.9%
759
 
2.7%
675
 
2.4%
647
 
2.3%
534
 
1.9%
522
 
1.9%
486
 
1.7%
Other values (457) 19414
69.0%
ASCII
ValueCountFrequency (%)
) 1987
31.2%
( 1978
31.1%
1298
20.4%
S 137
 
2.2%
C 100
 
1.6%
+ 94
 
1.5%
K 86
 
1.4%
1 60
 
0.9%
G 57
 
0.9%
L 56
 
0.9%
Other values (48) 517
 
8.1%
None
ValueCountFrequency (%)
2
100.0%

사업장명 (site name)
Text

Distinct: 3901
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 30.6 KiB

Length

Max length: 29
Median length: 23
Mean length: 10.718093
Min length: 2

Characters and Unicode

Total characters: 41822
Distinct characters: 591
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3900
Unique (%): 99.9%

Sample

1st row: 한국지엠(주) 부평공장 (본사)
2nd row: 한국지엠(주) 보령공장
3rd row: 한국지엠(주) 창원공장
4th row: ROHMKOREA 대전공장
5th row: 롯데정밀화학(주)

Value | Count | Frequency (%)
주식회사 | 116 | 2.0%
본사 | 95 | 1.6%
주)이마트 | 87 | 1.5%
홈플러스 | 56 | 1.0%
롯데백화점 | 29 | 0.5%
롯데마트 | 27 | 0.5%
대표사업장 | 22 | 0.4%
울산공장 | 21 | 0.4%
주)엘지화학 | 19 | 0.3%
현대백화점 | 18 | 0.3%
Other values (3865) | 5380 | 91.7%

Most occurring characters

ValueCountFrequency (%)
2018
 
4.8%
1995
 
4.8%
) 1639
 
3.9%
( 1635
 
3.9%
1407
 
3.4%
1065
 
2.5%
1037
 
2.5%
828
 
2.0%
669
 
1.6%
648
 
1.5%
Other values (581) 28881
69.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 34659
82.9%
Space Separator 1995
 
4.8%
Close Punctuation 1639
 
3.9%
Open Punctuation 1635
 
3.9%
Uppercase Letter 918
 
2.2%
Connector Punctuation 558
 
1.3%
Decimal Number 246
 
0.6%
Lowercase Letter 96
 
0.2%
Other Punctuation 56
 
0.1%
Dash Punctuation 12
 
< 0.1%
Other values (2) 8
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2018
 
5.8%
1407
 
4.1%
1065
 
3.1%
1037
 
3.0%
828
 
2.4%
669
 
1.9%
648
 
1.9%
637
 
1.8%
598
 
1.7%
586
 
1.7%
Other values (518) 25166
72.6%
Uppercase Letter
ValueCountFrequency (%)
S 134
14.6%
C 121
13.2%
K 90
9.8%
L 65
 
7.1%
G 63
 
6.9%
D 57
 
6.2%
T 51
 
5.6%
I 50
 
5.4%
B 35
 
3.8%
J 34
 
3.7%
Other values (14) 218
23.7%
Lowercase Letter
ValueCountFrequency (%)
t 18
18.8%
k 14
14.6%
a 11
11.5%
s 9
9.4%
e 7
 
7.3%
p 5
 
5.2%
b 4
 
4.2%
o 4
 
4.2%
l 4
 
4.2%
u 3
 
3.1%
Other values (8) 17
17.7%
Decimal Number
ValueCountFrequency (%)
2 97
39.4%
1 80
32.5%
3 29
 
11.8%
4 13
 
5.3%
5 10
 
4.1%
0 8
 
3.3%
6 4
 
1.6%
8 2
 
0.8%
7 2
 
0.8%
9 1
 
0.4%
Other Punctuation
ValueCountFrequency (%)
& 39
69.6%
, 12
 
21.4%
. 3
 
5.4%
/ 2
 
3.6%
Space Separator
ValueCountFrequency (%)
1995
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1639
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1635
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 558
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 12
100.0%
Other Symbol
ValueCountFrequency (%)
7
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 34666
82.9%
Common 6142
 
14.7%
Latin 1014
 
2.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2018
 
5.8%
1407
 
4.1%
1065
 
3.1%
1037
 
3.0%
828
 
2.4%
669
 
1.9%
648
 
1.9%
637
 
1.8%
598
 
1.7%
586
 
1.7%
Other values (519) 25173
72.6%
Latin
ValueCountFrequency (%)
S 134
13.2%
C 121
 
11.9%
K 90
 
8.9%
L 65
 
6.4%
G 63
 
6.2%
D 57
 
5.6%
T 51
 
5.0%
I 50
 
4.9%
B 35
 
3.5%
J 34
 
3.4%
Other values (32) 314
31.0%
Common
ValueCountFrequency (%)
1995
32.5%
) 1639
26.7%
( 1635
26.6%
_ 558
 
9.1%
2 97
 
1.6%
1 80
 
1.3%
& 39
 
0.6%
3 29
 
0.5%
4 13
 
0.2%
- 12
 
0.2%
Other values (10) 45
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 34659
82.9%
ASCII 7156
 
17.1%
None 7
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2018
 
5.8%
1407
 
4.1%
1065
 
3.1%
1037
 
3.0%
828
 
2.4%
669
 
1.9%
648
 
1.9%
637
 
1.8%
598
 
1.7%
586
 
1.7%
Other values (518) 25166
72.6%
ASCII
ValueCountFrequency (%)
1995
27.9%
) 1639
22.9%
( 1635
22.8%
_ 558
 
7.8%
S 134
 
1.9%
C 121
 
1.7%
2 97
 
1.4%
K 90
 
1.3%
1 80
 
1.1%
L 65
 
0.9%
Other values (52) 742
 
10.4%
None
ValueCountFrequency (%)
7
100.0%

사업장구분 (site classification)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 30.6 KiB
사업장: 2080
대표사업장: 1822

Length

Max length: 5
Median length: 3
Mean length: 3.9338801
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대표사업장
2nd row: 사업장
3rd row: 사업장
4th row: 대표사업장
5th row: 대표사업장

Common Values

Value | Count | Frequency (%)
사업장 | 2080 | 53.3%
대표사업장 | 1822 | 46.7%

Length

Histogram of lengths of the category

유형 (type)
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 30.6 KiB
제조: 1511
공공행정: 1099
기타서비스: 851
기타산업: 253
교육서비스: 113

Length

Max length: 5
Median length: 4
Mean length: 3.4341363
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 제조
2nd row: 제조
3rd row: 제조
4th row: 제조
5th row: 제조

Common Values

Value | Count | Frequency (%)
제조 | 1511 | 38.7%
공공행정 | 1099 | 28.2%
기타서비스 | 851 | 21.8%
기타산업 | 253 | 6.5%
교육서비스 | 113 | 2.9%
보건 | 75 | 1.9%

Length

Histogram of lengths of the category

특성 (characteristic)
Categorical

HIGH CORRELATION

Distinct: 12
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 30.6 KiB
배출권할당대상업체: 2161
온실가스목표관리업체: 434
공공기관: 313
지방자치단체: 296
주권상장법인: 233
Other values (7): 465

Length

Max length: 10
Median length: 9
Mean length: 7.7608919
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 배출권할당대상업체
2nd row: 배출권할당대상업체
3rd row: 배출권할당대상업체
4th row: 온실가스목표관리업체
5th row: 주권상장법인

Common Values

Value | Count | Frequency (%)
배출권할당대상업체 | 2161 | 55.4%
온실가스목표관리업체 | 434 | 11.1%
공공기관 | 313 | 8.0%
지방자치단체 | 296 | 7.6%
주권상장법인 | 233 | 6.0%
지방공단 | 117 | 3.0%
녹색기업 | 107 | 2.7%
지방공사 | 77 | 2.0%
중앙행정기관 | 77 | 2.0%
국공립대학 | 52 | 1.3%
Other values (2) | 35 | 0.9%

Length

Histogram of lengths of the category

업종 (industry)
Categorical

HIGH CORRELATION

Distinct: 20
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 30.6 KiB
제조업: 1473
공공행정, 국방 및 사회보장 행정: 567
도매 및 소매업: 316
전기, 가스, 증기 및 수도사업: 307
하수·폐기물 처리, 원료재생 및 환경복원업: 283
Other values (15): 956

Length

Max length: 24
Median length: 23
Mean length: 10.210661
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 제조업
2nd row: 제조업
3rd row: 제조업
4th row: 제조업
5th row: 제조업

Common Values

Value | Count | Frequency (%)
제조업 | 1473 | 37.7%
공공행정, 국방 및 사회보장 행정 | 567 | 14.5%
도매 및 소매업 | 316 | 8.1%
전기, 가스, 증기 및 수도사업 | 307 | 7.9%
하수·폐기물 처리, 원료재생 및 환경복원업 | 283 | 7.3%
운수업 | 151 | 3.9%
전문, 과학 및 기술 서비스업 | 127 | 3.3%
교육 서비스업 | 110 | 2.8%
보건업 및 사회복지 서비스업 | 98 | 2.5%
출판, 영상, 방송통신 및 정보서비스업 | 95 | 2.4%
Other values (10) | 375 | 9.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
2121
18.8%
제조업 1473
 
13.1%
사회보장 567
 
5.0%
행정 567
 
5.0%
공공행정 567
 
5.0%
국방 567
 
5.0%
서비스업 465
 
4.1%
소매업 316
 
2.8%
도매 316
 
2.8%
전기 307
 
2.7%
Other values (40) 4018
35.6%

세부업종 (detailed industry)
Text

Distinct: 69
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 30.6 KiB

Length

Max length: 28
Median length: 20
Mean length: 13.814967
Min length: 2

Characters and Unicode

Total characters: 53906
Distinct characters: 162
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): 0.2%

Sample

1st row: 자동차 및 트레일러 제조업
2nd row: 자동차 및 트레일러 제조업
3rd row: 자동차 및 트레일러 제조업
4th row: 전자부품, 컴퓨터, 영상, 음향 및 통신장비 제조업
5th row: 화학물질 및 화학제품 제조업;의약품 제외
ValueCountFrequency (%)
2150
 
14.6%
제조업 1171
 
8.0%
제외 629
 
4.3%
공공행정 567
 
3.9%
국방 567
 
3.9%
사회보장 567
 
3.9%
행정 567
 
3.9%
자동차 403
 
2.7%
서비스업 325
 
2.2%
소매업 301
 
2.0%
Other values (138) 7458
50.7%

Most occurring characters

ValueCountFrequency (%)
10803
20.0%
3420
 
6.3%
3010
 
5.6%
2150
 
4.0%
, 1862
 
3.5%
1742
 
3.2%
1661
 
3.1%
1412
 
2.6%
1188
 
2.2%
1137
 
2.1%
Other values (152) 25521
47.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 40396
74.9%
Space Separator 10803
 
20.0%
Other Punctuation 2543
 
4.7%
Decimal Number 164
 
0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3420
 
8.5%
3010
 
7.5%
2150
 
5.3%
1742
 
4.3%
1661
 
4.1%
1412
 
3.5%
1188
 
2.9%
1137
 
2.8%
1134
 
2.8%
777
 
1.9%
Other values (147) 22765
56.4%
Other Punctuation
ValueCountFrequency (%)
, 1862
73.2%
; 676
 
26.6%
· 5
 
0.2%
Space Separator
ValueCountFrequency (%)
10803
100.0%
Decimal Number
ValueCountFrequency (%)
1 164
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 40396
74.9%
Common 13510
 
25.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3420
 
8.5%
3010
 
7.5%
2150
 
5.3%
1742
 
4.3%
1661
 
4.1%
1412
 
3.5%
1188
 
2.9%
1137
 
2.8%
1134
 
2.8%
777
 
1.9%
Other values (147) 22765
56.4%
Common
ValueCountFrequency (%)
10803
80.0%
, 1862
 
13.8%
; 676
 
5.0%
1 164
 
1.2%
· 5
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 40396
74.9%
ASCII 13505
 
25.1%
None 5
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
10803
80.0%
, 1862
 
13.8%
; 676
 
5.0%
1 164
 
1.2%
Hangul
ValueCountFrequency (%)
3420
 
8.5%
3010
 
7.5%
2150
 
5.3%
1742
 
4.3%
1661
 
4.1%
1412
 
3.5%
1188
 
2.9%
1137
 
2.8%
1134
 
2.8%
777
 
1.9%
Other values (147) 22765
56.4%
None
ValueCountFrequency (%)
· 5
100.0%

년도 (year)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 30.6 KiB
2021: 3902

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021
2nd row: 2021
3rd row: 2021
4th row: 2021
5th row: 2021

Common Values

Value | Count | Frequency (%)
2021 | 3902 | 100.0%

Length

Histogram of lengths of the category

일반폐기물 (general waste)
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS

Distinct: 3504
Distinct (%): 89.9%
Missing: 3
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 15964.825
Minimum: 0
Maximum: 11902864
Zeros: 103
Zeros (%): 2.6%
Negative: 0
Negative (%): 0.0%
Memory size: 34.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1.130916
Q1: 27.975
Median: 235.503
Q3: 1795.9
95-th percentile: 26068.507
Maximum: 11902864
Range: 11902864
Interquartile range (IQR): 1767.925

Descriptive statistics

Standard deviation: 290355.39
Coefficient of variation (CV): 18.187195
Kurtosis: 1273.0473
Mean: 15964.825
Median Absolute Deviation (MAD): 231.69675
Skewness: 34.85329
Sum: 62246852
Variance: 8.4306253 × 10^10
Monotonicity: Not monotonic
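
Several of these statistics are tied together by simple identities, which makes them easy to sanity-check. The sketch below recomputes the coefficient of variation, the interquartile range, the variance and the sum from the other reported values for 일반폐기물. It assumes that the mean and sum are taken over the non-missing rows only (3902 observations minus 3 missing), which is an inference from the numbers and not something the report states.

```python
# Values copied from the 일반폐기물 statistics.
n_observations, n_missing = 3902, 3
mean, std = 15964.825, 290355.39
q1, q3 = 27.975, 1795.9

cv = std / mean                                # coefficient of variation
iqr = q3 - q1                                  # interquartile range
variance = std ** 2                            # about 8.43e10
total = mean * (n_observations - n_missing)    # assumption: missing rows excluded

print(f"CV       = {cv:.6f}")
print(f"IQR      = {iqr:.3f}")
print(f"variance = {variance:.7e}")
print(f"sum      = {total:.0f}")
```

Small differences in the last digits are expected because the inputs are already rounded in the report.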
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0.0 | 103 | 2.6%
15.0 | 15 | 0.4%
1.5 | 8 | 0.2%
24.0 | 7 | 0.2%
12.0 | 7 | 0.2%
45.0 | 7 | 0.2%
37.5 | 7 | 0.2%
42.0 | 7 | 0.2%
2.5 | 7 | 0.2%
0.75 | 7 | 0.2%
Other values (3494) | 3724 | 95.4%

Minimum 10 values

Value | Count | Frequency (%)
0.0 | 103 | 2.6%
0.015 | 1 | < 0.1%
0.0349 | 1 | < 0.1%
0.035 | 1 | < 0.1%
0.05 | 3 | 0.1%
0.06 | 1 | < 0.1%
0.075 | 2 | 0.1%
0.1 | 2 | 0.1%
0.1125 | 1 | < 0.1%
0.12 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
11902863.73 | 1 | < 0.1%
9809265.17 | 1 | < 0.1%
8860309.0 | 1 | < 0.1%
1453022.0 | 1 | < 0.1%
1433012.0 | 1 | < 0.1%
1285580.77 | 1 | < 0.1%
1249774.0 | 1 | < 0.1%
1175164.08 | 1 | < 0.1%
960632.4 | 1 | < 0.1%
769565.74 | 1 | < 0.1%

지정폐기물 (designated waste)
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS

Distinct: 1792
Distinct (%): 46.0%
Missing: 4
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1169.5803
Minimum: 0
Maximum: 1225980
Zeros: 1922
Zeros (%): 49.3%
Negative: 0
Negative (%): 0.0%
Memory size: 34.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0.225
Q3: 64.175
95-th percentile: 2639.8955
Maximum: 1225980
Range: 1225980
Interquartile range (IQR): 64.175

Descriptive statistics

Standard deviation: 20500.855
Coefficient of variation (CV): 17.528387
Kurtosis: 3273.9696
Mean: 1169.5803
Median Absolute Deviation (MAD): 0.225
Skewness: 55.15898
Sum: 4559023.9
Variance: 4.2028507 × 10^8
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0.0 | 1922 | 49.3%
2.0 | 9 | 0.2%
1.0 | 8 | 0.2%
0.3 | 8 | 0.2%
1.5 | 6 | 0.2%
0.4 | 6 | 0.2%
0.6 | 6 | 0.2%
1.2 | 5 | 0.1%
0.8 | 5 | 0.1%
5.0 | 5 | 0.1%
Other values (1782) | 1918 | 49.2%

Minimum 10 values

Value | Count | Frequency (%)
0.0 | 1922 | 49.3%
0.0099 | 1 | < 0.1%
0.013 | 1 | < 0.1%
0.02 | 1 | < 0.1%
0.03 | 2 | 0.1%
0.031 | 1 | < 0.1%
0.04 | 1 | < 0.1%
0.07 | 2 | 0.1%
0.0835 | 1 | < 0.1%
0.1 | 3 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
1225980.0 | 1 | < 0.1%
157170.0 | 1 | < 0.1%
151405.502 | 1 | < 0.1%
149054.8 | 1 | < 0.1%
101519.0 | 1 | < 0.1%
67901.0 | 1 | < 0.1%
65437.25 | 1 | < 0.1%
65376.91 | 1 | < 0.1%
58803.15 | 1 | < 0.1%
55733.51 | 1 | < 0.1%

건설폐기물 (construction waste)
Real number (ℝ)

SKEWED  ZEROS

Distinct: 391
Distinct (%): 10.0%
Missing: 3
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1708.8989
Minimum: 0
Maximum: 1049565
Zeros: 3508
Zeros (%): 89.9%
Negative: 0
Negative (%): 0.0%
Memory size: 34.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 525.68
Maximum: 1049565
Range: 1049565
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 31357.715
Coefficient of variation (CV): 18.349661
Kurtosis: 695.60869
Mean: 1708.8989
Median Absolute Deviation (MAD): 0
Skewness: 25.49967
Sum: 6662996.6
Variance: 9.8330626 × 10^8
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0.0 | 3508 | 89.9%
4.85 | 2 | 0.1%
531.62 | 1 | < 0.1%
22326.0 | 1 | < 0.1%
296.73 | 1 | < 0.1%
118050.0 | 1 | < 0.1%
2194.47 | 1 | < 0.1%
1999.12 | 1 | < 0.1%
3105.42 | 1 | < 0.1%
268.5 | 1 | < 0.1%
Other values (381) | 381 | 9.8%
(Missing) | 3 | 0.1%

Minimum 10 values

Value | Count | Frequency (%)
0.0 | 3508 | 89.9%
3.09 | 1 | < 0.1%
3.22 | 1 | < 0.1%
3.956 | 1 | < 0.1%
4.0 | 1 | < 0.1%
4.5 | 1 | < 0.1%
4.7 | 1 | < 0.1%
4.85 | 2 | 0.1%
5.23 | 1 | < 0.1%
5.62 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
1049565.0 | 1 | < 0.1%
869806.0 | 1 | < 0.1%
762688.4 | 1 | < 0.1%
709664.093 | 1 | < 0.1%
643777.0 | 1 | < 0.1%
552958.86 | 1 | < 0.1%
264258.0 | 1 | < 0.1%
188451.623 | 1 | < 0.1%
134852.0 | 1 | < 0.1%
118050.0 | 1 | < 0.1%

폐기물발생량총량 (total waste generated)
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS

Distinct: 3593
Distinct (%): 92.2%
Missing: 3
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 18843.004
Minimum: 0
Maximum: 11942624
Zeros: 74
Zeros (%): 1.9%
Negative: 0
Negative (%): 0.0%
Memory size: 34.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1.799316
Q1: 37.78125
Median: 382
Q3: 3067.875
95-th percentile: 33341.402
Maximum: 11942624
Range: 11942624
Interquartile range (IQR): 3030.0938

Descriptive statistics

Standard deviation: 294453.84
Coefficient of variation (CV): 15.626693
Kurtosis: 1226.1099
Mean: 18843.004
Median Absolute Deviation (MAD): 377.51
Skewness: 33.960775
Sum: 73468873
Variance: 8.6703062 × 10^10
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0.0 | 74 | 1.9%
15.0 | 11 | 0.3%
2.5 | 7 | 0.2%
12.0 | 7 | 0.2%
37.5 | 7 | 0.2%
5.0 | 6 | 0.2%
42.0 | 6 | 0.2%
24.0 | 6 | 0.2%
1.5 | 6 | 0.2%
0.75 | 6 | 0.2%
Other values (3583) | 3763 | 96.4%

Minimum 10 values

Value | Count | Frequency (%)
0.0 | 74 | 1.9%
0.015 | 1 | < 0.1%
0.0349 | 1 | < 0.1%
0.035 | 1 | < 0.1%
0.05 | 3 | 0.1%
0.06 | 1 | < 0.1%
0.075 | 2 | 0.1%
0.1 | 1 | < 0.1%
0.1125 | 1 | < 0.1%
0.12 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
11942624.05 | 1 | < 0.1%
9863990.13 | 1 | < 0.1%
8938540.0 | 1 | < 0.1%
1530007.0 | 1 | < 0.1%
1434619.0 | 1 | < 0.1%
1298018.7 | 1 | < 0.1%
1260573.0 | 1 | < 0.1%
1225980.35 | 1 | < 0.1%
1181953.78 | 1 | < 0.1%
1049689.0 | 1 | < 0.1%

Interactions

[16 interaction plots; images not reproduced in this text]

Correlations

 | 사업장구분 | 유형 | 특성 | 업종 | 세부업종 | 일반폐기물 | 지정폐기물 | 건설폐기물 | 폐기물발생량총량
사업장구분 | 1.000 | 0.118 | 0.727 | 0.560 | 0.594 | 0.000 | 0.000 | 0.000 | 0.003
유형 | 0.118 | 1.000 | 0.849 | 0.951 | 0.977 | 0.048 | 0.000 | 0.099 | 0.037
특성 | 0.727 | 0.849 | 1.000 | 0.801 | 0.848 | 0.000 | 0.082 | 0.172 | 0.000
업종 | 0.560 | 0.951 | 0.801 | 1.000 | 1.000 | 0.000 | 0.000 | 0.372 | 0.000
세부업종 | 0.594 | 0.977 | 0.848 | 1.000 | 1.000 | 0.000 | 0.106 | 0.360 | 0.000
일반폐기물 | 0.000 | 0.048 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 1.000
지정폐기물 | 0.000 | 0.000 | 0.082 | 0.000 | 0.106 | 0.000 | 1.000 | 0.000 | 0.384
건설폐기물 | 0.000 | 0.099 | 0.172 | 0.372 | 0.360 | 0.000 | 0.000 | 1.000 | 0.000
폐기물발생량총량 | 0.003 | 0.037 | 0.000 | 0.000 | 0.000 | 1.000 | 0.384 | 0.000 | 1.000

 | 사업장구분 | 업종 | 유형 | 특성
사업장구분 | 1.000 | 0.445 | 0.085 | 0.576
업종 | 0.445 | 1.000 | 0.822 | 0.424
유형 | 0.085 | 0.822 | 1.000 | 0.503
특성 | 0.576 | 0.424 | 0.503 | 1.000

 | 일반폐기물 | 지정폐기물 | 건설폐기물 | 폐기물발생량총량 | 사업장구분 | 유형 | 특성 | 업종
일반폐기물 | 1.000 | 0.462 | 0.084 | 0.938 | 0.000 | 0.032 | 0.000 | 0.000
지정폐기물 | 0.462 | 1.000 | 0.172 | 0.564 | 0.000 | 0.000 | 0.037 | 0.000
건설폐기물 | 0.084 | 0.172 | 1.000 | 0.244 | 0.000 | 0.055 | 0.073 | 0.157
폐기물발생량총량 | 0.938 | 0.564 | 0.244 | 1.000 | 0.004 | 0.025 | 0.000 | 0.000
사업장구분 | 0.000 | 0.000 | 0.000 | 0.004 | 1.000 | 0.085 | 0.576 | 0.445
유형 | 0.032 | 0.000 | 0.055 | 0.025 | 0.085 | 1.000 | 0.503 | 0.822
특성 | 0.000 | 0.037 | 0.073 | 0.000 | 0.576 | 0.503 | 1.000 | 0.424
업종 | 0.000 | 0.000 | 0.157 | 0.000 | 0.445 | 0.822 | 0.424 | 1.000
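
The extracted text does not say which coefficient produced each matrix. For pairs of categorical columns such as 유형 and 업종, one commonly used association measure is Cramér's V, sketched below in its plain, uncorrected form. This illustrates the general technique under that assumption; it is not a claim about how the values above were computed, and the label lists are hypothetical toy data that only borrow category names from this report.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Plain (uncorrected) Cramér's V for two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed counts per (x, y) cell
    row = Counter(xs)              # marginal counts of x
    col = Counter(ys)              # marginal counts of y
    chi2 = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Hypothetical toy labels: each 유형 value always pairs with the same 업종 value.
type_col = ["제조", "제조", "공공행정", "공공행정"]
industry_col = ["제조업", "제조업", "운수업", "운수업"]
print(cramers_v(type_col, industry_col))   # perfect association

# Hypothetical toy labels with no association at all.
shuffled_col = ["제조업", "운수업", "제조업", "운수업"]
print(cramers_v(type_col, shuffled_col))
```

Values near 1 indicate that one column almost determines the other, which is the pattern flagged by the High correlation alerts for the categorical columns.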

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
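
Nullity correlation can be sketched as the Pearson correlation between the missing/present indicators of two columns. The version below is a minimal standard-library illustration of that idea, not the code that drew the heatmap; the function name and the three short columns are hypothetical, with `None` standing for a missing cell.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")        # undefined when a column has no variation in nullity
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy columns.
general = [1.5, None, 3.0, None, 2.0]
total = [1.5, None, 3.5, None, 2.0]        # missing in exactly the same rows
designated = [0.0, 0.2, None, 0.0, 0.0]    # missing in a different row

print(nullity_correlation(general, total))
print(nullity_correlation(general, designated))
```

A value of 1 means two columns are always missing together; a negative value means that when one is missing the other tends to be present.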

Sample

First rows

Index | 대표사업장코드 | 대표사업장명 | 사업장명 | 사업장구분 | 유형 | 특성 | 업종 | 세부업종 | 년도 | 일반폐기물 | 지정폐기물 | 건설폐기물 | 폐기물발생량총량
0 | 00000000000000095296 | 한국지엠(주) 부평공장 (본사) | 한국지엠(주) 부평공장 (본사) | 대표사업장 | 제조 | 배출권할당대상업체 | 제조업 | 자동차 및 트레일러 제조업 | 2021 | 7997.0 | 1230.0 | 0.0 | 9227.0
1 | 00000000000000095296 | 한국지엠(주) 부평공장 (본사) | 한국지엠(주) 보령공장 | 사업장 | 제조 | 배출권할당대상업체 | 제조업 | 자동차 및 트레일러 제조업 | 2021 | 545.57 | 291.221 | 0.0 | 836.791
2 | 00000000000000095296 | 한국지엠(주) 부평공장 (본사) | 한국지엠(주) 창원공장 | 사업장 | 제조 | 배출권할당대상업체 | 제조업 | 자동차 및 트레일러 제조업 | 2021 | 615.44 | 853.14 | 0.0 | 1468.58
3 | 00000000000000095297 | ROHMKOREA 대전공장 | ROHMKOREA 대전공장 | 대표사업장 | 제조 | 온실가스목표관리업체 | 제조업 | 전자부품, 컴퓨터, 영상, 음향 및 통신장비 제조업 | 2021 | 588.283 | 33.86 | 0.0 | 622.143
4 | 00000000000000095300 | 롯데정밀화학(주) | 롯데정밀화학(주) | 대표사업장 | 제조 | 주권상장법인 | 제조업 | 화학물질 및 화학제품 제조업;의약품 제외 | 2021 | 81938.51 | 650.72 | 325.73 | 82914.96
5 | 00000000000000095300 | 롯데정밀화학(주) | 롯데정밀화학(주) 인천 | 사업장 | 제조 | 배출권할당대상업체 | 제조업 | 화학물질 및 화학제품 제조업;의약품 제외 | 2021 | 329.86 | 0.79 | 0.0 | 330.65
6 | 00000000000000095311 | (주)휴비스 전주공장 | (주)휴비스 전주공장 | 대표사업장 | 제조 | 배출권할당대상업체 | 제조업 | 화학물질 및 화학제품 제조업;의약품 제외 | 2021 | 27018.78 | 2326.79 | 531.62 | 29877.19
7 | 00000000000000095311 | (주)휴비스 전주공장 | (주)휴비스 | 사업장 | 제조 | 배출권할당대상업체 | 제조업 | 섬유제품 제조업; 의복제외 | 2021 | 40.608 | 0.0 | 0.0 | 40.608
8 | 00000000000000095314 | 코카콜라음료(주) 여주공장 | 코카콜라음료(주) 여주공장 | 대표사업장 | 제조 | 온실가스목표관리업체 | 제조업 | 음료 제조업 | 2021 | 2802.93 | 14.35 | 0.0 | 2817.28
9 | 00000000000000095314 | 코카콜라음료(주) 여주공장 | 코카콜라음료(주) 광주공장 | 사업장 | 제조 | 녹색기업 | 제조업 | 음료 제조업 | 2021 | 2281.88 | 1.61 | 0.0 | 2283.49

Last rows

Index | 대표사업장코드 | 대표사업장명 | 사업장명 | 사업장구분 | 유형 | 특성 | 업종 | 세부업종 | 년도 | 일반폐기물 | 지정폐기물 | 건설폐기물 | 폐기물발생량총량
3892 | CT000000000000054193 | 삼성물산(주) | 삼성물산(주) | 대표사업장 | 기타산업 | 주권상장법인 | 건설업 | 종합 건설업 | 2021 | 3.016 | 0.0 | 0.0 | 3.016
3893 | CT000000000000054193 | 삼성물산(주) | 삼성물산(주)건설_강동ECT | 사업장 | 기타산업 | 배출권할당대상업체 | 건설업 | 종합 건설업 | 2021 | 247.68 | 0.0 | 0.0 | 247.68
3894 | CT000000000000054193 | 삼성물산(주) | 삼성물산(주)건설_건설현장 | 사업장 | 기타산업 | 배출권할당대상업체 | 건설업 | 종합 건설업 | 2021 | 1305.65 | 28.92 | 552958.86 | 554293.43
3895 | CT000000000000054193 | 삼성물산(주) | 삼성물산(주)건설_본사 | 사업장 | 기타산업 | 배출권할당대상업체 | 건설업 | 종합 건설업 | 2021 | 91.401 | 1.0885 | 0.0 | 92.4895
3896 | CT000000000000054193 | 삼성물산(주) | 삼성물산(주)에버랜드리조트 | 사업장 | 기타산업 | 녹색기업 | 농업,임업 및 어업 | 임업 | 2021 | 6516.04 | 99.95 | 4643.55 | 11259.54
3897 | CT000000000000054193 | 삼성물산(주) | 삼성물산(주)패션_직물 | 사업장 | 기타산업 | 배출권할당대상업체 | 제조업 | 섬유제품 제조업; 의복제외 | 2021 | 53.2 | 3.76 | 0.0 | 56.96
3898 | CT000000000000054195 | 대웅제약 | 대웅제약 | 대표사업장 | 제조 | 기타 | 제조업 | 의료용 물질 및 의약품 제조업 | 2021 | 271.49 | 20.76 | 0.0 | 292.25
3899 | CT000000000000054196 | 주식회사 덕양에너젠 | 주식회사 덕양에너젠 | 대표사업장 | 제조 | 배출권할당대상업체 | 제조업 | 화학물질 및 화학제품 제조업;의약품 제외 | 2021 | 1.17 | 1.38 | 0.0 | 2.55
3900 | CT000000000000054269 | 동성화물자동차(주) | 동성화물자동차(주) | 대표사업장 | 기타서비스 | 온실가스목표관리업체 | 운수업 | 육상운송 및 파이프라인 운송업 | 2021 | 6.175 | 0.0 | 0.0 | 6.175
3901 | CT000000000000054274 | 금호고속 | 금호고속 | 대표사업장 | 기타서비스 | 배출권할당대상업체 | 운수업 | 육상운송 및 파이프라인 운송업 | 2021 | 375.19 | 68.786 | 0.0 | 443.976
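
In the sample rows, 폐기물발생량총량 appears to equal the sum of 일반폐기물, 지정폐기물 and 건설폐기물. The sketch below checks that relationship for a few rows as read from the sample; the helper name and the 0.01 tolerance are assumptions made for illustration, and nothing here verifies the rule for the full dataset.

```python
import math

# (일반폐기물, 지정폐기물, 건설폐기물, 폐기물발생량총량) as read from four sample rows.
rows = [
    (7997.0, 1230.0, 0.0, 9227.0),
    (545.57, 291.221, 0.0, 836.791),
    (27018.78, 2326.79, 531.62, 29877.19),
    (1305.65, 28.92, 552958.86, 554293.43),
]

def total_matches(general, designated, construction, total, tol=0.01):
    """True when the three waste components add up to the reported total."""
    return math.isclose(general + designated + construction, total, abs_tol=tol)

results = [total_matches(*row) for row in rows]
print(results)
```

A check like this is also a quick way to explain why the total correlates so strongly with 일반폐기물, which is by far its largest component in most rows.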