Overview

Dataset statistics

Number of variables: 3
Number of observations: 645
Missing cells: 4
Missing cells (%): 0.2%
Duplicate rows: 60
Duplicate rows (%): 9.3%
Total size in memory: 15.2 KiB
Average record size in memory: 24.2 B
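
The two percentages above follow directly from the counts. The short check below is only a sketch: it assumes the missing-cell percentage is taken over all cells (variables times observations) and the duplicate percentage over all observations. The helper name `percent` is illustrative and not part of the report.

```python
# Counts copied from the "Dataset statistics" block.
n_variables = 3
n_observations = 645
missing_cells = 4
duplicate_rows = 60


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Assumption: missing cells are measured against every cell in the table.
total_cells = n_variables * n_observations
missing_pct = percent(missing_cells, total_cells)

# Assumption: duplicate rows are measured against the number of observations.
duplicate_pct = percent(duplicate_rows, n_observations)

print(f"Missing cells (%): {missing_pct}%")
print(f"Duplicate rows (%): {duplicate_pct}%")
```

Under those assumptions the results match the reported 0.2% and 9.3%.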

Variable types

Categorical: 1
Text: 2

Dataset

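Description: Published as public data: the reported records of business-site waste and designated waste (medical waste) for Hongcheon-gun, Gangwon-do. Records are broken down by business name, reporting year, type of waste discharged, and other fields.
Author: 강원도 홍천군 (Hongcheon-gun, Gangwon-do)
URL: https://www.data.go.kr/data/15081067/fileData.do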

Alerts

Duplicates: Dataset has 60 (9.3%) duplicate rows

Reproduction

Analysis started: 2024-04-18 01:57:49.146539
Analysis finished: 2024-04-18 01:57:50.634465
Duration: 1.49 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

신고기준연도 (reporting reference year)
Categorical

Distinct: 28
Distinct (%): 4.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 KiB

Value | Count
2001 | 61
2012 | 56
2014 | 45
2011 | 45
2015 | 45
Other values (23) | 393

Length

Max length: 6
Median length: 4
Mean length: 4.0031008
Min length: 4
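
The length statistics can be cross-checked with a little arithmetic. The sketch below assumes all 645 observations contribute to the mean (the variable has no missing values) and that every length is an integer between the reported minimum and maximum. It recovers the total character count implied by the mean.

```python
n_observations = 645
mean_length = 4.0031008
min_length, max_length = 4, 6

# Lengths are integers, so the total number of characters must be an integer.
total_chars = round(mean_length * n_observations)

# Characters beyond the minimum length, summed over all rows.
excess = total_chars - min_length * n_observations

print(total_chars, excess)
```

An excess of 2 together with a maximum length of 6 means a single value accounts for all of it: one entry is six characters long and the remaining 644 are four characters long.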

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: 2021
2nd row: 2020
3rd row: 2020
4th row: 2020
5th row: 2020

Common Values

Value | Count | Frequency (%)
2001 | 61 | 9.5%
2012 | 56 | 8.7%
2014 | 45 | 7.0%
2011 | 45 | 7.0%
2015 | 45 | 7.0%
2013 | 40 | 6.2%
2010 | 38 | 5.9%
2000 | 37 | 5.7%
2019 | 37 | 5.7%
2009 | 34 | 5.3%
Other values (18) | 207 | 32.1%

Length

[Figure: histogram of lengths of the category]

상호 (business name)
Text

Distinct: 234
Distinct (%): 36.4%
Missing: 2
Missing (%): 0.3%
Memory size: 5.2 KiB

Length

Max length: 20
Median length: 16
Mean length: 8.3328149
Min length: 2

Characters and Unicode

Total characters: 5358
Distinct characters: 283
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
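
As an illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own implementation. The helper name `category_counts` is hypothetical, and the two input strings are taken from the Sample rows of this variable.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            # unicodedata.category returns a two-letter code such as
            # "Lo" (Other Letter), "Ps" (Open Punctuation),
            # "Pe" (Close Punctuation) or "Zs" (Space Separator).
            counts[unicodedata.category(ch)] += 1
    return counts


# Two business names from the Sample rows of this variable.
names = ["강원사료", "(주)생그린식품 홍천지점"]
counts = category_counts(names)
print(counts)
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables that follow.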

Unique

Unique: 100
Unique (%): 15.6%

Sample

1st row: 강원사료
2nd row: (주)생그린식품 홍천지점
3rd row: 뫼내뜰영농조합법인식품공장
4th row: 세이지우드 홍천
5th row: 사람과안전건설화재에너지연구원

Value | Count | Frequency (%)
하이트진로(주)강원공장 | 23 | 3.2%
주식회사 | 21 | 3.0%
주)소노인터내셔널 | 19 | 2.7%
홍천군청 | 16 | 2.3%
주)씨티씨바이오 | 11 | 1.5%
보림개발(주 | 10 | 1.4%
환경시설관리주식회사 | 10 | 1.4%
강원도시가스(주 | 9 | 1.3%
시스템면역의학연구소 | 8 | 1.1%
합)강원환경 | 8 | 1.1%
Other values (236) | 576 | 81.0%

Most occurring characters

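Value | Count | Frequency (%)
(not preserved) | 269 | 5.0%
( | 266 | 5.0%
) | 266 | 5.0%
(not preserved) | 195 | 3.6%
(not preserved) | 166 | 3.1%
(not preserved) | 114 | 2.1%
(not preserved) | 112 | 2.1%
(not preserved) | 110 | 2.1%
(not preserved) | 110 | 2.1%
(not preserved) | 94 | 1.8%
Other values (273) | 3656 | 68.2%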

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4659 | 87.0%
Open Punctuation | 266 | 5.0%
Close Punctuation | 266 | 5.0%
Decimal Number | 76 | 1.4%
Space Separator | 68 | 1.3%
Uppercase Letter | 18 | 0.3%
Other Symbol | 4 | 0.1%
Other Punctuation | 1 | < 0.1%

Most frequent character per category

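Other Letter
Value | Count | Frequency (%)
(not preserved) | 269 | 5.8%
(not preserved) | 195 | 4.2%
(not preserved) | 166 | 3.6%
(not preserved) | 114 | 2.4%
(not preserved) | 112 | 2.4%
(not preserved) | 110 | 2.4%
(not preserved) | 110 | 2.4%
(not preserved) | 94 | 2.0%
(not preserved) | 93 | 2.0%
(not preserved) | 79 | 1.7%
Other values (252) | 3317 | 71.2%

Decimal Number
Value | Count | Frequency (%)
1 | 17 | 22.4%
5 | 13 | 17.1%
7 | 13 | 17.1%
9 | 11 | 14.5%
3 | 10 | 13.2%
2 | 5 | 6.6%
6 | 4 | 5.3%
0 | 3 | 3.9%

Uppercase Letter
Value | Count | Frequency (%)
O | 7 | 38.9%
G | 3 | 16.7%
C | 2 | 11.1%
M | 2 | 11.1%
E | 1 | 5.6%
S | 1 | 5.6%
K | 1 | 5.6%
N | 1 | 5.6%

Open Punctuation
Value | Count | Frequency (%)
( | 266 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 266 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 68 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(not preserved) | 4 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 1 | 100.0%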

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4663 | 87.0%
Common | 677 | 12.6%
Latin | 18 | 0.3%

Most frequent character per script

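Hangul
Value | Count | Frequency (%)
(not preserved) | 269 | 5.8%
(not preserved) | 195 | 4.2%
(not preserved) | 166 | 3.6%
(not preserved) | 114 | 2.4%
(not preserved) | 112 | 2.4%
(not preserved) | 110 | 2.4%
(not preserved) | 110 | 2.4%
(not preserved) | 94 | 2.0%
(not preserved) | 93 | 2.0%
(not preserved) | 79 | 1.7%
Other values (253) | 3321 | 71.2%

Common
Value | Count | Frequency (%)
( | 266 | 39.3%
) | 266 | 39.3%
(space) | 68 | 10.0%
1 | 17 | 2.5%
5 | 13 | 1.9%
7 | 13 | 1.9%
9 | 11 | 1.6%
3 | 10 | 1.5%
2 | 5 | 0.7%
6 | 4 | 0.6%
Other values (2) | 4 | 0.6%

Latin
Value | Count | Frequency (%)
O | 7 | 38.9%
G | 3 | 16.7%
C | 2 | 11.1%
M | 2 | 11.1%
E | 1 | 5.6%
S | 1 | 5.6%
K | 1 | 5.6%
N | 1 | 5.6%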

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4659 | 87.0%
ASCII | 695 | 13.0%
None | 4 | 0.1%

Most frequent character per block

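Hangul
Value | Count | Frequency (%)
(not preserved) | 269 | 5.8%
(not preserved) | 195 | 4.2%
(not preserved) | 166 | 3.6%
(not preserved) | 114 | 2.4%
(not preserved) | 112 | 2.4%
(not preserved) | 110 | 2.4%
(not preserved) | 110 | 2.4%
(not preserved) | 94 | 2.0%
(not preserved) | 93 | 2.0%
(not preserved) | 79 | 1.7%
Other values (252) | 3317 | 71.2%

ASCII
Value | Count | Frequency (%)
( | 266 | 38.3%
) | 266 | 38.3%
(space) | 68 | 9.8%
1 | 17 | 2.4%
5 | 13 | 1.9%
7 | 13 | 1.9%
9 | 11 | 1.6%
3 | 10 | 1.4%
O | 7 | 1.0%
2 | 5 | 0.7%
Other values (10) | 19 | 2.7%

None
Value | Count | Frequency (%)
(not preserved) | 4 | 100.0%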

폐기물 종류(사업장폐기물) (waste type, business-site waste)
Text

Distinct: 89
Distinct (%): 13.8%
Missing: 2
Missing (%): 0.3%
Memory size: 5.2 KiB

Length

Max length: 84
Median length: 79
Mean length: 11.981337
Min length: 2

Characters and Unicode

Total characters: 7704
Distinct characters: 226
Distinct categories: 9
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 30
Unique (%): 4.7%

Sample

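1st row: 그 밖의 유기성오니 (other organic sludge)
2nd row: 그 밖의 폐수처리오니 (other wastewater treatment sludge)
3rd row: 그 밖의 식물성잔재물 (other plant residues)
4th row: 임목폐목재(건설공사_ 산지개간 등의 과정에서 발생된 나무뿌리_ 가지_ 줄기 등을 말한다) (forest waste wood: tree roots, branches, trunks, etc. generated during construction work, forest land clearing, and similar processes)
5th row: 폐합성수지류(폐염화비닐수지류는 제외한다) (waste synthetic resins, excluding waste vinyl chloride resins)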

Value | Count | Frequency (%)
제외한다 | 91 | 6.6%
일반의료폐기물 | 80 | 5.8%
손상성폐기물 | 78 | 5.7%
밖의 | 78 | 5.7%
(not preserved) | 78 | 5.7%
폐합성수지류(폐염화비닐수지류는 | 43 | 3.1%
생물ㆍ화학폐기물 | 43 | 3.1%
병리계폐기물 | 41 | 3.0%
폐목재류 | 38 | 2.8%
재활용하는 | 36 | 2.6%
Other values (180) | 772 | 56.0%

Most occurring characters

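Value | Count | Frequency (%)
(space) | 737 | 9.6%
(not preserved) | 634 | 8.2%
(not preserved) | 471 | 6.1%
(not preserved) | 358 | 4.6%
(not preserved) | 220 | 2.9%
(not preserved) | 197 | 2.6%
(not preserved) | 174 | 2.3%
(not preserved) | 156 | 2.0%
(not preserved) | 154 | 2.0%
(not preserved) | 154 | 2.0%
Other values (216) | 4449 | 57.7%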

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 6607 | 85.8%
Space Separator | 737 | 9.6%
Close Punctuation | 124 | 1.6%
Open Punctuation | 124 | 1.6%
Decimal Number | 54 | 0.7%
Connector Punctuation | 35 | 0.5%
Lowercase Letter | 12 | 0.2%
Other Punctuation | 10 | 0.1%
Dash Punctuation | 1 | < 0.1%

Most frequent character per category

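Other Letter
Value | Count | Frequency (%)
(not preserved) | 634 | 9.6%
(not preserved) | 471 | 7.1%
(not preserved) | 358 | 5.4%
(not preserved) | 220 | 3.3%
(not preserved) | 197 | 3.0%
(not preserved) | 174 | 2.6%
(not preserved) | 156 | 2.4%
(not preserved) | 154 | 2.3%
(not preserved) | 154 | 2.3%
(not preserved) | 149 | 2.3%
Other values (197) | 3940 | 59.6%

Lowercase Letter
Value | Count | Frequency (%)
e | 4 | 33.3%
r | 2 | 16.7%
g | 2 | 16.7%
a | 2 | 16.7%
s | 2 | 16.7%

Decimal Number
Value | Count | Frequency (%)
1 | 34 | 63.0%
2 | 15 | 27.8%
8 | 3 | 5.6%
0 | 2 | 3.7%

Close Punctuation
Value | Count | Frequency (%)
) | 117 | 94.4%
(not preserved) | 5 | 4.0%
] | 2 | 1.6%

Open Punctuation
Value | Count | Frequency (%)
( | 117 | 94.4%
(not preserved) | 5 | 4.0%
[ | 2 | 1.6%

Space Separator
Value | Count | Frequency (%)
(space) | 737 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 35 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 10 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%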

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 6605 | 85.7%
Common | 1085 | 14.1%
Latin | 12 | 0.2%
Han | 2 | < 0.1%

Most frequent character per script

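Hangul
Value | Count | Frequency (%)
(not preserved) | 634 | 9.6%
(not preserved) | 471 | 7.1%
(not preserved) | 358 | 5.4%
(not preserved) | 220 | 3.3%
(not preserved) | 197 | 3.0%
(not preserved) | 174 | 2.6%
(not preserved) | 156 | 2.4%
(not preserved) | 154 | 2.3%
(not preserved) | 154 | 2.3%
(not preserved) | 149 | 2.3%
Other values (195) | 3938 | 59.6%

Common
Value | Count | Frequency (%)
(space) | 737 | 67.9%
) | 117 | 10.8%
( | 117 | 10.8%
_ | 35 | 3.2%
1 | 34 | 3.1%
2 | 15 | 1.4%
. | 10 | 0.9%
(not preserved) | 5 | 0.5%
(not preserved) | 5 | 0.5%
8 | 3 | 0.3%
Other values (4) | 7 | 0.6%

Latin
Value | Count | Frequency (%)
e | 4 | 33.3%
r | 2 | 16.7%
g | 2 | 16.7%
a | 2 | 16.7%
s | 2 | 16.7%

Han
Value | Count | Frequency (%)
(not preserved) | 1 | 50.0%
(not preserved) | 1 | 50.0%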

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 6523 | 84.7%
ASCII | 1087 | 14.1%
Compat Jamo | 82 | 1.1%
None | 10 | 0.1%
CJK | 2 | < 0.1%

Most frequent character per block

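ASCII
Value | Count | Frequency (%)
(space) | 737 | 67.8%
) | 117 | 10.8%
( | 117 | 10.8%
_ | 35 | 3.2%
1 | 34 | 3.1%
2 | 15 | 1.4%
. | 10 | 0.9%
e | 4 | 0.4%
8 | 3 | 0.3%
] | 2 | 0.2%
Other values (7) | 13 | 1.2%

Hangul
Value | Count | Frequency (%)
(not preserved) | 634 | 9.7%
(not preserved) | 471 | 7.2%
(not preserved) | 358 | 5.5%
(not preserved) | 220 | 3.4%
(not preserved) | 197 | 3.0%
(not preserved) | 174 | 2.7%
(not preserved) | 156 | 2.4%
(not preserved) | 154 | 2.4%
(not preserved) | 154 | 2.4%
(not preserved) | 149 | 2.3%
Other values (194) | 3856 | 59.1%

Compat Jamo
Value | Count | Frequency (%)
(not preserved) | 82 | 100.0%

None
Value | Count | Frequency (%)
(not preserved) | 5 | 50.0%
(not preserved) | 5 | 50.0%

CJK
Value | Count | Frequency (%)
(not preserved) | 1 | 50.0%
(not preserved) | 1 | 50.0%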

Correlations

 | 신고기준연도 | 폐기물 종류(사업장폐기물)
신고기준연도 | 1.000 | 0.926
폐기물 종류(사업장폐기물) | 0.926 | 1.000
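
The report does not state which association measure produced the value 0.926. For two categorical columns, one common choice is Cramér's V, so the sketch below shows the plain (uncorrected) statistic on toy labels. It is offered only as an assumption about the kind of measure involved, not as a reproduction of the reported figure. The function name `cramers_v` and the toy labels are hypothetical.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k == 0:
        # With a single category the statistic is undefined; report 0 here.
        return 0.0
    chi2 = 0.0
    for x, rx in row_totals.items():
        for y, cy in col_totals.items():
            expected = rx * cy / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))


# Toy labels (hypothetical): the first pair is perfectly associated,
# the second pair is independent.
perfect = cramers_v(["a", "a", "b", "b"], ["p", "p", "q", "q"])
independent = cramers_v(["a", "a", "b", "b"], ["p", "q", "p", "q"])
print(perfect, independent)
```

Values near 1 indicate that knowing one column almost determines the other, which is the reading suggested by a coefficient of 0.926.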

Missing values

Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
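
To make the idea of nullity correlation concrete, the sketch below computes the Pearson correlation between the missing-value indicators of two columns. The columns are made-up toy data and `nullity_correlation` is a hypothetical helper, not a function of ydata-profiling.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the is-missing indicators of two columns.

    Returns None when either indicator is constant (column fully present or
    fully missing), because the correlation is undefined in that case.
    """
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / math.sqrt(var_a * var_b)


# Toy columns (hypothetical values); None marks a missing cell.
name = ["A", None, "C", None, "E", "F"]
waste = ["x", None, "z", None, "w", "v"]  # missing in exactly the same rows
year = ["2000", "2001", "2002", "2003", "2004", "2005"]  # never missing

same_rows = nullity_correlation(name, waste)
never_missing = nullity_correlation(name, year)
print(same_rows, never_missing)
```

Two columns that are missing in exactly the same rows score 1, while a column with no missing values has no defined nullity correlation with any other.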

Sample

 | 신고기준연도 | 상호 | 폐기물 종류(사업장폐기물)
0 | 2021 | 강원사료 | 그 밖의 유기성오니
1 | 2020 | (주)생그린식품 홍천지점 | 그 밖의 폐수처리오니
2 | 2020 | 뫼내뜰영농조합법인식품공장 | 그 밖의 식물성잔재물
3 | 2020 | 세이지우드 홍천 | 임목폐목재(건설공사_ 산지개간 등의 과정에서 발생된 나무뿌리_ 가지_ 줄기 등을 말한다)
4 | 2020 | 사람과안전건설화재에너지연구원 | 폐합성수지류(폐염화비닐수지류는 제외한다)
5 | 2020 | 사람과안전건설화재에너지연구원 | 폐벽돌
6 | 2020 | 힐드로사이 주식회사 | 폐합성수지류(폐염화비닐수지류는 제외한다)
7 | 2019 | 탁자원 | 폐합성수지류(폐염화비닐수지류는 제외한다)
8 | 2019 | 세이지우드 홍천 | 폐합성수지류(폐염화비닐수지류는 제외한다)
9 | 2019 | (주)권선개발 | 석재ㆍ골재폐수처리오니(석재ㆍ골재 생산 시 발생한 폐수를 처리하는 과정에서 발생한 오니로 한정한다)

 | 신고기준연도 | 상호 | 폐기물 종류(사업장폐기물)
635 | 2000 | 내촌보건지소 | 생물ㆍ화학폐기물
636 | 2000 | 내촌보건지소 | 손상성폐기물
637 | 2000 | 내촌보건지소 | 병리계폐기물
638 | 2000 | 내촌보건지소 | 조직물류폐기물(태반을 재활용하는 경우는 제외한다)
639 | 2000 | 내촌보건지소 | 일반의료폐기물
640 | 2000 | 홍천군보건소 | 병리계폐기물
641 | 2000 | 홍천군보건소 | 손상성폐기물
642 | 2000 | 홍천군보건소 | 생물ㆍ화학폐기물
643 | 2000 | 홍천군보건소 | 일반의료폐기물
644 | 2000 | 홍천군보건소 | 병리계폐기물

Duplicate rows

Most frequently occurring

 | 신고기준연도 | 상호 | 폐기물 종류(사업장폐기물) | # duplicates
1 | 1997 | 하이트진로(주)강원공장 | 그 밖의 폐수처리오니 | 8
2 | 1997 | 하이트진로(주)강원공장 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 7
10 | 2002 | 환경관리 주식회사 | 하수처리오니 | 6
47 | 2015 | 환경시설관리주식회사 | 분뇨처리오니 | 6
9 | 2002 | (합자)홍천환경산업 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 5
44 | 2015 | 강원도시가스(주) | 가축분뇨처리오니 | 5
13 | 2006 | (주)소노인터내셔널 | 하수처리오니 | 4
17 | 2010 | (합)강원환경 | 그 밖의 폐목재류 | 4
18 | 2010 | (합)강원환경 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 4
8 | 2002 | (합자)홍천환경산업 | 그 밖의 폐목재류 | 3
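
A table like this one can be approximated by grouping identical rows and counting them. The sketch below does so with the standard library, using the last five rows (640 to 644) of the Sample table as input. The helper name `duplicate_groups` is hypothetical, and whether the report's headline duplicate count refers to groups or to individual repeated rows is not stated here, so the sketch simply lists each repeated row with its number of occurrences.

```python
from collections import Counter

# Rows 640-644 of the Sample table:
# (신고기준연도, 상호, 폐기물 종류(사업장폐기물)).
rows = [
    ("2000", "홍천군보건소", "병리계폐기물"),
    ("2000", "홍천군보건소", "손상성폐기물"),
    ("2000", "홍천군보건소", "생물ㆍ화학폐기물"),
    ("2000", "홍천군보건소", "일반의료폐기물"),
    ("2000", "홍천군보건소", "병리계폐기물"),
]


def duplicate_groups(rows):
    """Return (row, count) pairs for rows occurring more than once,
    most frequent first."""
    counts = Counter(rows)
    return [(row, n) for row, n in counts.most_common() if n > 1]


groups = duplicate_groups(rows)
for row, n in groups:
    print(row, n)
```

In this small slice only the 병리계폐기물 entry for 홍천군보건소 repeats, appearing at rows 640 and 644.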