Overview

Dataset statistics

Number of variables: 4
Number of observations: 3349
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 607
Duplicate rows (%): 18.1%
Total size in memory: 104.8 KiB
Average record size in memory: 32.0 B
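
The two derived figures in this block follow from the raw counts. The sketch below recomputes them as a quick sanity check; the variable names are illustrative and the rounding to one decimal place is an assumption about how the report displays values.

```python
# Figures copied from the dataset statistics block.
n_rows = 3349          # number of observations
n_duplicates = 607     # duplicate rows
total_kib = 104.8      # total size in memory, in KiB

# Share of duplicate rows, as a percentage with one decimal place.
duplicate_pct = round(100 * n_duplicates / n_rows, 1)

# Average record size in bytes (1 KiB = 1024 bytes).
avg_record_bytes = round(total_kib * 1024 / n_rows, 1)

print(f"Duplicate rows (%): {duplicate_pct}")
print(f"Average record size: {avg_record_bytes} B")
```

Both results agree with the values reported here (18.1% and 32.0 B).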

Variable types

Categorical: 1
Text: 3

Dataset

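Description: Data on the registration status of business-site waste dischargers in Gimhae-si. It provides the following fields: waste classification, business name, address, and waste type.
Author: Gimhae-si, Gyeongsangnam-do (경상남도 김해시)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15060327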

Alerts

Dataset has 607 (18.1%) duplicate rows [Duplicates]

Reproduction

Analysis started: 2024-03-13 00:11:31.134728
Analysis finished: 2024-03-13 00:11:31.613914
Duration: 0.48 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

폐기물구분 (waste classification)
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 26.3 KiB
사업장일반폐기물 (general business-site waste): 2265
지정폐기물 (designated waste): 1084

Length

Max length: 8
Median length: 8
Mean length: 7.0289639
Min length: 5
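
These length statistics can be rederived from the two category counts of this variable. The following is a minimal sketch, assuming that length means the number of Unicode code points per value (which is what Python's `len` returns for these strings); the variable names are illustrative.

```python
import statistics

# Category counts reported for 폐기물구분.
counts = {"사업장일반폐기물": 2265, "지정폐기물": 1084}

# One length per observation: 8 for the first category, 5 for the second.
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

mean_length = sum(lengths) / len(lengths)
median_length = statistics.median(lengths)

# Share of each category, rounded to one decimal place.
shares = {value: round(100 * n / len(lengths), 1) for value, n in counts.items()}

print(min(lengths), max(lengths), median_length, mean_length)
print(shares)
```

The weighted mean is (2265 × 8 + 1084 × 5) / 3349, which is consistent with the reported 7.0289639, and the shares come out at 67.6% and 32.4%.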

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 사업장일반폐기물
2nd row: 사업장일반폐기물
3rd row: 사업장일반폐기물
4th row: 사업장일반폐기물
5th row: 사업장일반폐기물

Common Values

Value | Count | Frequency (%)
사업장일반폐기물 | 2265 | 67.6%
지정폐기물 | 1084 | 32.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
사업장일반폐기물 | 2265 | 67.6%
지정폐기물 | 1084 | 32.4%

상호 (business name)
Text

Distinct: 1338
Distinct (%): 40.0%
Missing: 0
Missing (%): 0.0%
Memory size: 26.3 KiB

Length

Max length: 27
Median length: 24
Mean length: 8.839355
Min length: 1

Characters and Unicode

Total characters: 29603
Distinct characters: 469
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
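
As an illustration of these character properties, the sketch below tallies the Unicode general category of every character in the first sample value of this variable, using Python's standard `unicodedata` module. It only demonstrates the idea and is not the profiler's own implementation. The two-letter codes map to the category names used in this report: `Lo` is Other Letter, `Ps` is Open Punctuation, `Pe` is Close Punctuation and `Zs` is Space Separator.

```python
import unicodedata
from collections import Counter

# First sample value of the 상호 column.
sample = "(주)코스트코 코리아(김해점)"

# Count characters per Unicode general category.
categories = Counter(unicodedata.category(ch) for ch in sample)

for code, count in categories.most_common():
    print(code, count)
```

For this 16-character value the tally is 11 Other Letter, 2 Open Punctuation, 2 Close Punctuation and 1 Space Separator.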

Unique

Unique: 688
Unique (%): 20.5%
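
The report lists both a Distinct and a Unique count. The sketch below shows the difference on a small hypothetical column, assuming that Distinct is the number of different values and Unique is the number of values that occur exactly once, which is how the figures here read (1338 distinct names, 688 of them appearing only once).

```python
from collections import Counter

# Hypothetical mini-column for illustration; not taken from the dataset.
values = ["A", "A", "B", "C", "C", "C", "D"]

freq = Counter(values)

# Distinct: how many different values appear at all (A, B, C, D).
distinct = len(freq)

# Unique: how many values appear exactly once (B, D).
unique = sum(1 for n in freq.values() if n == 1)

print(distinct, unique)
```

Under that reading, Unique can never exceed Distinct, and the gap between them is the number of values that repeat.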

Sample

1st row: (주)코스트코 코리아(김해점)
2nd row: (주)코스트코 코리아(김해점)
3rd row: (주)코스트코 코리아(김해점)
4th row: (주)코스트코 코리아(김해점)
5th row: (주)코스트코 코리아(김해점)

Value | Count | Frequency (%)
주식회사 | 109 | 2.7%
한국자동차환경협회 | 50 | 1.2%
롯데쇼핑(주 | 35 | 0.9%
의료법인 | 34 | 0.8%
김해공장 | 29 | 0.7%
인제대학교 | 27 | 0.7%
이은요양병원 | 26 | 0.6%
의료폐기물공동운영기구 | 24 | 0.6%
김해시의사회 | 24 | 0.6%
김해지점 | 20 | 0.5%
Other values (1406) | 3717 | 90.8%

Most occurring characters

Value | Count | Frequency (%)
 | 2017 | 6.8%
( | 1974 | 6.7%
) | 1974 | 6.7%
[space] | 750 | 2.5%
 | 580 | 2.0%
 | 577 | 1.9%
 | 538 | 1.8%
 | 538 | 1.8%
 | 502 | 1.7%
 | 477 | 1.6%
Other values (459) | 19676 | 66.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 24395 | 82.4%
Open Punctuation | 1974 | 6.7%
Close Punctuation | 1974 | 6.7%
Space Separator | 750 | 2.5%
Uppercase Letter | 363 | 1.2%
Decimal Number | 70 | 0.2%
Lowercase Letter | 35 | 0.1%
Other Punctuation | 25 | 0.1%
Dash Punctuation | 15 | 0.1%
Connector Punctuation | 2 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 2017 | 8.3%
 | 580 | 2.4%
 | 577 | 2.4%
 | 538 | 2.2%
 | 538 | 2.2%
 | 502 | 2.1%
 | 477 | 2.0%
 | 418 | 1.7%
 | 413 | 1.7%
 | 407 | 1.7%
Other values (415) | 17928 | 73.5%

Uppercase Letter
Value | Count | Frequency (%)
C | 66 | 18.2%
S | 30 | 8.3%
T | 27 | 7.4%
E | 26 | 7.2%
N | 24 | 6.6%
M | 24 | 6.6%
K | 24 | 6.6%
H | 22 | 6.1%
R | 21 | 5.8%
B | 15 | 4.1%
Other values (14) | 84 | 23.1%

Decimal Number
Value | Count | Frequency (%)
2 | 30 | 42.9%
1 | 17 | 24.3%
7 | 8 | 11.4%
5 | 7 | 10.0%
3 | 5 | 7.1%
6 | 3 | 4.3%

Lowercase Letter
Value | Count | Frequency (%)
o | 10 | 28.6%
p | 5 | 14.3%
n | 5 | 14.3%
m | 5 | 14.3%
a | 5 | 14.3%
y | 5 | 14.3%

Other Punctuation
Value | Count | Frequency (%)
. | 18 | 72.0%
& | 6 | 24.0%
/ | 1 | 4.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1974 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1974 | 100.0%

Space Separator
Value | Count | Frequency (%)
[space] | 750 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 15 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 24395 | 82.4%
Common | 4810 | 16.2%
Latin | 398 | 1.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 2017 | 8.3%
 | 580 | 2.4%
 | 577 | 2.4%
 | 538 | 2.2%
 | 538 | 2.2%
 | 502 | 2.1%
 | 477 | 2.0%
 | 418 | 1.7%
 | 413 | 1.7%
 | 407 | 1.7%
Other values (415) | 17928 | 73.5%

Latin
Value | Count | Frequency (%)
C | 66 | 16.6%
S | 30 | 7.5%
T | 27 | 6.8%
E | 26 | 6.5%
N | 24 | 6.0%
M | 24 | 6.0%
K | 24 | 6.0%
H | 22 | 5.5%
R | 21 | 5.3%
B | 15 | 3.8%
Other values (20) | 119 | 29.9%

Common
Value | Count | Frequency (%)
( | 1974 | 41.0%
) | 1974 | 41.0%
[space] | 750 | 15.6%
2 | 30 | 0.6%
. | 18 | 0.4%
1 | 17 | 0.4%
- | 15 | 0.3%
7 | 8 | 0.2%
5 | 7 | 0.1%
& | 6 | 0.1%
Other values (4) | 11 | 0.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 24395 | 82.4%
ASCII | 5208 | 17.6%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 2017 | 8.3%
 | 580 | 2.4%
 | 577 | 2.4%
 | 538 | 2.2%
 | 538 | 2.2%
 | 502 | 2.1%
 | 477 | 2.0%
 | 418 | 1.7%
 | 413 | 1.7%
 | 407 | 1.7%
Other values (415) | 17928 | 73.5%

ASCII
Value | Count | Frequency (%)
( | 1974 | 37.9%
) | 1974 | 37.9%
[space] | 750 | 14.4%
C | 66 | 1.3%
S | 30 | 0.6%
2 | 30 | 0.6%
T | 27 | 0.5%
E | 26 | 0.5%
N | 24 | 0.5%
M | 24 | 0.5%
Other values (34) | 283 | 5.4%

사업장도로명주소 (business site road-name address)
Text

Distinct: 1183
Distinct (%): 35.3%
Missing: 0
Missing (%): 0.0%
Memory size: 26.3 KiB

Length

Max length: 47
Median length: 38
Mean length: 23.581368
Min length: 1

Characters and Unicode

Total characters: 78974
Distinct characters: 199
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 587
Unique (%): 17.5%

Sample

1st row: 경상남도 김해시 주촌면 선천남로 16
2nd row: 경상남도 김해시 주촌면 선천남로 16
3rd row: 경상남도 김해시 주촌면 선천남로 16
4th row: 경상남도 김해시 주촌면 선천남로 16
5th row: 경상남도 김해시 주촌면 선천남로 16

Value | Count | Frequency (%)
김해시 | 3145 | 19.5%
경상남도 | 3131 | 19.4%
한림면 | 570 | 3.5%
주촌면 | 451 | 2.8%
진영읍 | 377 | 2.3%
진례면 | 293 | 1.8%
생림면 | 283 | 1.8%
상동면 | 253 | 1.6%
김해대로 | 201 | 1.2%
안동 | 121 | 0.7%
Other values (1153) | 7314 | 45.3%

Most occurring characters

Value | Count | Frequency (%)
[space] | 13299 | 16.8%
 | 3741 | 4.7%
 | 3741 | 4.7%
 | 3524 | 4.5%
 | 3159 | 4.0%
 | 3142 | 4.0%
 | 3140 | 4.0%
 | 3131 | 4.0%
 | 3131 | 4.0%
1 | 3061 | 3.9%
Other values (189) | 35905 | 45.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 46401 | 58.8%
Decimal Number | 15900 | 20.1%
Space Separator | 13299 | 16.8%
Dash Punctuation | 1213 | 1.5%
Open Punctuation | 918 | 1.2%
Close Punctuation | 918 | 1.2%
Connector Punctuation | 274 | 0.3%
Uppercase Letter | 21 | < 0.1%
Other Punctuation | 15 | < 0.1%
Math Symbol | 15 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 3741 | 8.1%
 | 3741 | 8.1%
 | 3524 | 7.6%
 | 3159 | 6.8%
 | 3142 | 6.8%
 | 3140 | 6.8%
 | 3131 | 6.7%
 | 3131 | 6.7%
 | 1885 | 4.1%
 | 1726 | 3.7%
Other values (167) | 16081 | 34.7%

Decimal Number
Value | Count | Frequency (%)
1 | 3061 | 19.3%
2 | 2107 | 13.3%
3 | 1816 | 11.4%
5 | 1503 | 9.5%
4 | 1448 | 9.1%
6 | 1392 | 8.8%
9 | 1293 | 8.1%
7 | 1212 | 7.6%
0 | 1185 | 7.5%
8 | 883 | 5.6%

Uppercase Letter
Value | Count | Frequency (%)
B | 8 | 38.1%
L | 8 | 38.1%
T | 2 | 9.5%
K | 2 | 9.5%
F | 1 | 4.8%

Space Separator
Value | Count | Frequency (%)
[space] | 13299 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1213 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 918 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 918 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 274 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
: | 15 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 15 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 46401 | 58.8%
Common | 32552 | 41.2%
Latin | 21 | < 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 3741 | 8.1%
 | 3741 | 8.1%
 | 3524 | 7.6%
 | 3159 | 6.8%
 | 3142 | 6.8%
 | 3140 | 6.8%
 | 3131 | 6.7%
 | 3131 | 6.7%
 | 1885 | 4.1%
 | 1726 | 3.7%
Other values (167) | 16081 | 34.7%

Common
Value | Count | Frequency (%)
[space] | 13299 | 40.9%
1 | 3061 | 9.4%
2 | 2107 | 6.5%
3 | 1816 | 5.6%
5 | 1503 | 4.6%
4 | 1448 | 4.4%
6 | 1392 | 4.3%
9 | 1293 | 4.0%
- | 1213 | 3.7%
7 | 1212 | 3.7%
Other values (7) | 4208 | 12.9%

Latin
Value | Count | Frequency (%)
B | 8 | 38.1%
L | 8 | 38.1%
T | 2 | 9.5%
K | 2 | 9.5%
F | 1 | 4.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 46401 | 58.8%
ASCII | 32573 | 41.2%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
[space] | 13299 | 40.8%
1 | 3061 | 9.4%
2 | 2107 | 6.5%
3 | 1816 | 5.6%
5 | 1503 | 4.6%
4 | 1448 | 4.4%
6 | 1392 | 4.3%
9 | 1293 | 4.0%
- | 1213 | 3.7%
7 | 1212 | 3.7%
Other values (12) | 4229 | 13.0%

Hangul
Value | Count | Frequency (%)
 | 3741 | 8.1%
 | 3741 | 8.1%
 | 3524 | 7.6%
 | 3159 | 6.8%
 | 3142 | 6.8%
 | 3140 | 6.8%
 | 3131 | 6.7%
 | 3131 | 6.7%
 | 1885 | 4.1%
 | 1726 | 3.7%
Other values (167) | 16081 | 34.7%

폐기물 종류 (waste type)
Text

Distinct: 172
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Memory size: 26.3 KiB

Length

Max length: 84
Median length: 66
Mean length: 16.120036
Min length: 1

Characters and Unicode

Total characters: 53986
Distinct characters: 267
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 38
Unique (%): 1.1%

Sample

1st row: 폐합성수지류(폐염화비닐수지류는 제외한다)
2nd row: 축산물가공잔재물(동물성 유지류는 제외한다)
3rd row: 자동차 폐타이어
4th row: 음식물류폐기물
5th row: 폐합성수지류(폐염화비닐수지류는 제외한다)

Value | Count | Frequency (%)
 | 754 | 8.2%
밖의 | 751 | 8.2%
제외한다 | 640 | 7.0%
폐합성수지류(폐염화비닐수지류는 | 505 | 5.5%
폐유 | 318 | 3.5%
말한다 | 304 | 3.3%
 | 259 | 2.8%
등을 | 192 | 2.1%
분진 | 179 | 1.9%
폐광물유[아스팔트유ㆍ그리스(grease)ㆍ방청유 | 142 | 1.5%
Other values (271) | 5148 | 56.0%

Most occurring characters

Value | Count | Frequency (%)
[space] | 6084 | 11.3%
 | 3791 | 7.0%
 | 1946 | 3.6%
 | 1679 | 3.1%
 | 1563 | 2.9%
 | 1541 | 2.9%
 | 1375 | 2.5%
 | 1297 | 2.4%
 | 1191 | 2.2%
 | 1090 | 2.0%
Other values (257) | 32429 | 60.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 43800 | 81.1%
Space Separator | 6084 | 11.3%
Open Punctuation | 1157 | 2.1%
Close Punctuation | 1157 | 2.1%
Lowercase Letter | 852 | 1.6%
Connector Punctuation | 468 | 0.9%
Decimal Number | 446 | 0.8%
Other Punctuation | 22 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 3791 | 8.7%
 | 1946 | 4.4%
 | 1679 | 3.8%
 | 1563 | 3.6%
 | 1541 | 3.5%
 | 1375 | 3.1%
 | 1297 | 3.0%
 | 1191 | 2.7%
 | 1090 | 2.5%
 | 1082 | 2.5%
Other values (235) | 27245 | 62.2%

Decimal Number
Value | Count | Frequency (%)
2 | 219 | 49.1%
0 | 142 | 31.8%
1 | 46 | 10.3%
8 | 36 | 8.1%
4 | 1 | 0.2%
7 | 1 | 0.2%
3 | 1 | 0.2%

Lowercase Letter
Value | Count | Frequency (%)
e | 284 | 33.3%
r | 142 | 16.7%
a | 142 | 16.7%
g | 142 | 16.7%
s | 142 | 16.7%

Open Punctuation
Value | Count | Frequency (%)
( | 979 | 84.6%
[ | 142 | 12.3%
 | 36 | 3.1%

Close Punctuation
Value | Count | Frequency (%)
) | 979 | 84.6%
] | 142 | 12.3%
 | 36 | 3.1%

Other Punctuation
Value | Count | Frequency (%)
· | 20 | 90.9%
. | 2 | 9.1%

Space Separator
Value | Count | Frequency (%)
[space] | 6084 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 468 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 43800 | 81.1%
Common | 9334 | 17.3%
Latin | 852 | 1.6%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 3791 | 8.7%
 | 1946 | 4.4%
 | 1679 | 3.8%
 | 1563 | 3.6%
 | 1541 | 3.5%
 | 1375 | 3.1%
 | 1297 | 3.0%
 | 1191 | 2.7%
 | 1090 | 2.5%
 | 1082 | 2.5%
Other values (235) | 27245 | 62.2%

Common
Value | Count | Frequency (%)
[space] | 6084 | 65.2%
( | 979 | 10.5%
) | 979 | 10.5%
_ | 468 | 5.0%
2 | 219 | 2.3%
[ | 142 | 1.5%
] | 142 | 1.5%
0 | 142 | 1.5%
1 | 46 | 0.5%
8 | 36 | 0.4%
Other values (7) | 97 | 1.0%

Latin
Value | Count | Frequency (%)
e | 284 | 33.3%
r | 142 | 16.7%
a | 142 | 16.7%
g | 142 | 16.7%
s | 142 | 16.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 43000 | 79.7%
ASCII | 10094 | 18.7%
Compat Jamo | 800 | 1.5%
None | 92 | 0.2%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
[space] | 6084 | 60.3%
( | 979 | 9.7%
) | 979 | 9.7%
_ | 468 | 4.6%
e | 284 | 2.8%
2 | 219 | 2.2%
[ | 142 | 1.4%
r | 142 | 1.4%
] | 142 | 1.4%
a | 142 | 1.4%
Other values (9) | 513 | 5.1%

Hangul
Value | Count | Frequency (%)
 | 3791 | 8.8%
 | 1946 | 4.5%
 | 1679 | 3.9%
 | 1563 | 3.6%
 | 1541 | 3.6%
 | 1375 | 3.2%
 | 1297 | 3.0%
 | 1191 | 2.8%
 | 1090 | 2.5%
 | 1082 | 2.5%
Other values (234) | 26445 | 61.5%

Compat Jamo
Value | Count | Frequency (%)
 | 800 | 100.0%

None
Value | Count | Frequency (%)
 | 36 | 39.1%
 | 36 | 39.1%
· | 20 | 21.7%

Missing values

[Figure: nullity bar chart] A simple visualization of nullity by column.
[Figure: nullity matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 폐기물구분 | 상호 | 사업장도로명주소 | 폐기물 종류
0 | 사업장일반폐기물 | (주)코스트코 코리아(김해점) | 경상남도 김해시 주촌면 선천남로 16 | 폐합성수지류(폐염화비닐수지류는 제외한다)
1 | 사업장일반폐기물 | (주)코스트코 코리아(김해점) | 경상남도 김해시 주촌면 선천남로 16 | 축산물가공잔재물(동물성 유지류는 제외한다)
2 | 사업장일반폐기물 | (주)코스트코 코리아(김해점) | 경상남도 김해시 주촌면 선천남로 16 | 자동차 폐타이어
3 | 사업장일반폐기물 | (주)코스트코 코리아(김해점) | 경상남도 김해시 주촌면 선천남로 16 | 음식물류폐기물
4 | 사업장일반폐기물 | (주)코스트코 코리아(김해점) | 경상남도 김해시 주촌면 선천남로 16 | 폐합성수지류(폐염화비닐수지류는 제외한다)
5 | 사업장일반폐기물 | (주)코스트코 코리아(김해점) | 경상남도 김해시 주촌면 선천남로 16 | 폐합성수지류(폐염화비닐수지류는 제외한다)
6 | 사업장일반폐기물 | (주)코스트코 코리아(김해점) | 경상남도 김해시 주촌면 선천남로 16 | 폐발포합성수지
7 | 사업장일반폐기물 | (주)엘엑스하우시스 | 경상남도 김해시 진영읍 하계로240번길 209-40 | 그 밖의 폐목재류
8 | 사업장일반폐기물 | 신에이코리아 주식회사 | 경상남도 김해시 진영읍 본산로193번길 47 | 폐토사
9 | 사업장일반폐기물 | 주식회사 석지 김해 | 경상남도 김해시 주촌면 서부로 1474-54 | 폐합성수지류(폐염화비닐수지류는 제외한다)

Index | 폐기물구분 | 상호 | 사업장도로명주소 | 폐기물 종류
3339 | 지정폐기물 | 현대자동차(주) 서부산하이테크센터 | 경상남도 김해시 김해대로 2736 (지내동) | 폐황산이 포함된 2차폐축전지
3340 | 지정폐기물 | 현대자동차(주) 서부산하이테크센터 | 경상남도 김해시 김해대로 2736 (지내동) | 그 밖의 폐유
3341 | 지정폐기물 | 현대자동차(주) 서부산하이테크센터 | 경상남도 김해시 김해대로 2736 (지내동) | 폐윤활유(「자원의 절약과 재활용촉진에 관한 법률 시행령」 제18조에 따른 재활용의무 대상 제품ㆍ포장재인 기어유 및 내연기관용 윤활유를 말한다)
3342 | 지정폐기물 | 현대자동차(주) 서부산하이테크센터 | 경상남도 김해시 김해대로 2736 (지내동) | 폐기계유ㆍ폐작동유(공업용 기계유ㆍ냉동기유ㆍ터어빈유ㆍ베어링윤활유ㆍ압축기유ㆍ유압작동유ㆍ열매체유 및 프로세스유 등을 말한다)
3343 | 지정폐기물 | 현대자동차(주) 서부산하이테크센터 | 경상남도 김해시 김해대로 2736 (지내동) | 폐황산이 포함된 2차폐축전지
3344 | 지정폐기물 | 현대자동차(주) 서부산하이테크센터 | 경상남도 김해시 김해대로 2736 (지내동) | 폐오일필터
3345 | 지정폐기물 | 현대자동차(주) 서부산하이테크센터 | 경상남도 김해시 김해대로 2736 (지내동) | 그 밖의 폐광물유[아스팔트유ㆍ그리스(grease)ㆍ방청유 및 수용성절삭유_ 20퍼센트 이상의 이물질이 함유된 폐유_ 고체상태의 폐유 등을 말한다]
3346 | 지정폐기물 | 현대자동차(주) 서부산하이테크센터 | 경상남도 김해시 김해대로 2736 (지내동) | 폐오일필터
3347 | 지정폐기물 | 중국한의원 | 경상남도 김해시 김해대로2325번길 22 (부원동) | 일반의료폐기물
3348 | 지정폐기물 | 중국한의원 | 경상남도 김해시 김해대로2325번길 22 (부원동) | 손상성폐기물

Duplicate rows

Most frequently occurring

Index | 폐기물구분 | 상호 | 사업장도로명주소 | 폐기물 종류 | # duplicates
583 | 지정폐기물 | 한국자동차환경협회 | 경상남도 김해시 주촌면 서부로1403번길 67-52 | 그 밖의 광물류 | 15
586 | 지정폐기물 | 한국자동차환경협회 | 경상남도 김해시 주촌면 서부로1403번길 67-52 | 윤활유 | 12
303 | 사업장일반폐기물 | 한국자동차환경 | 경상남도 김해시 주촌면 서부로1403번길 67-52 | 폐합성수지류 | 9
3 | 사업장일반폐기물 | (주)HKM | 경상남도 김해시 상동면 묵방로120번길 20 | 그 밖의 분진 | 8
118 | 사업장일반폐기물 | (주)자연(김해시 음식물류폐기물 자원화처리시설) | 경상남도 김해시 진영읍 김해대로 832-68 | 중간가공음식물류폐기물 | 8
588 | 지정폐기물 | 한국자동차환경협회 | 경상남도 김해시 주촌면 서부로1403번길 67-52 | 폐오일필터 | 8
310 | 사업장일반폐기물 | 한성기업(주)김해공장 | 경상남도 김해시 삼안로 51 (안동) | 그 밖의 폐수처리오니 | 7
585 | 지정폐기물 | 한국자동차환경협회 | 경상남도 김해시 주촌면 서부로1403번길 67-52 | 유성페인트 | 7
4 | 사업장일반폐기물 | (주)SNC | 경상남도 김해시 유하로 201 (유하동) | 그 밖의 광재류 | 6
49 | 사업장일반폐기물 | (주)대하에코텍 | 경상남도 김해시 상동면 묵방로 197 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 6
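
A "most frequently occurring" listing of this kind can be approximated by counting identical rows and keeping those seen more than once. The sketch below does this with the standard library on hypothetical rows (the values are placeholders, not records from the dataset). It is not the profiler's own implementation, and whether the report's total of 607 counts repeated groups or surplus rows is not stated here.

```python
from collections import Counter

# Hypothetical rows: (classification, name, address, waste type).
rows = [
    ("general", "Shop A", "Road 1", "plastic"),
    ("general", "Shop A", "Road 1", "plastic"),
    ("general", "Shop A", "Road 1", "food"),
    ("designated", "Clinic B", "Road 2", "medical"),
    ("designated", "Clinic B", "Road 2", "medical"),
    ("designated", "Clinic B", "Road 2", "medical"),
]

# Count how often each full row occurs.
counts = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicates = [(row, n) for row, n in counts.most_common() if n > 1]

for row, n in duplicates:
    print(n, row)
```

Rows are compared as whole tuples, so two records count as duplicates only when all four fields match exactly.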