Overview

Dataset statistics

Number of variables: 5
Number of observations: 143
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.9 KiB
Average record size in memory: 41.9 B
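
The overview counts above can be recomputed directly from the raw file. The following is a minimal sketch using only the Python standard library. It assumes the data is available as CSV text, and it runs on a five-row excerpt taken from the Sample section of this report rather than on the full 143-row file. The function name `overview` and the "rows beyond the first occurrence" definition of a duplicate are assumptions made for illustration, not the profiler's own implementation.

```python
import csv
import io

# Excerpt only: the first five rows shown in the Sample section, written as CSV.
CSV_TEXT = """시도,시군구,업체명,연락처,발생량(톤)
강원도,강릉시,"(주)강원건설-강릉시 임영로 161, 임영로179번길 37 석면해체 제거",033-648-8161,1.98
강원도,양구군,양구고 기계실습실 증축 및 보수 지정폐기물 철거공사,033-480-1441,0.332
강원도,영월군,대성엠디아이(주) 라임켐센터,033-373-9930,0.62
강원도,원주시,에이치디씨리조트 주식회사,033-730-3939,42.333
강원도,정선군,정선군쓰레기위생매립장 비산재 위탁처리용역,033-560-2337,24.23
"""


def overview(csv_text):
    """Return basic dataset statistics for CSV text with a header row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, records = rows[0], rows[1:]
    n_cells = len(records) * len(header)
    # A cell counts as missing when it is empty or whitespace only (assumption).
    missing = sum(1 for record in records for cell in record if not cell.strip())
    # Duplicates are counted as rows beyond the first occurrence (assumption).
    duplicates = len(records) - len({tuple(record) for record in records})
    return {
        "variables": len(header),
        "observations": len(records),
        "missing_cells": missing,
        "missing_pct": 100.0 * missing / n_cells if n_cells else 0.0,
        "duplicate_rows": duplicates,
    }


stats = overview(CSV_TEXT)
print(stats)
```

On the full file the same function would be expected to report 5 variables and 143 observations, matching the figures above.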

Variable types

Categorical: 1
Text: 3
Numeric: 1

Dataset

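Description: Data maintained through the Allbaro system (올바로시스템) on business sites that discharge designated waste, recorded by year. Fields: province (시도), city/county/district (시군구), business name (업체명), contact number (연락처), and amount generated in tonnes (발생량(톤)).
URL: https://www.data.go.kr/data/15068827/fileData.do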

Alerts

업체명 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 15:39:52.492274
Analysis finished: 2023-12-12 15:39:53.436338
Duration: 0.94 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시도
Categorical

Distinct: 14
Distinct (%): 9.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Top categories (count):
전라북도: 44
경기도: 27
충청북도: 13
경상남도: 11
인천광역시: 9
Other values (9): 39

Length

Max length: 7
Median length: 4
Mean length: 3.9370629
Min length: 3

Unique

Unique: 2
Unique (%): 1.4%

Sample

1st row: 강원도
2nd row: 강원도
3rd row: 강원도
4th row: 강원도
5th row: 강원도

Common Values

Value | Count | Frequency (%)
전라북도 | 44 | 30.8%
경기도 | 27 | 18.9%
충청북도 | 13 | 9.1%
경상남도 | 11 | 7.7%
인천광역시 | 9 | 6.3%
경상북도 | 8 | 5.6%
전라남도 | 8 | 5.6%
충청남도 | 6 | 4.2%
강원도 | 5 | 3.5%
울산광역시 | 4 | 2.8%
Other values (4) | 8 | 5.6%
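
Each frequency in the Common Values table is the count divided by the 143 observations, and the tail is folded into a single "Other values" row. A minimal sketch of that calculation follows, using the counts reported for 시도. The four categories outside the top ten are not named in this report, so the `other_*` labels and their 3/3/1/1 split are hypothetical placeholders chosen only so that they sum to 8. The function name `frequency_table` is likewise an assumption.

```python
from collections import Counter


def frequency_table(counts, top=10):
    """Return (value, count, percent) rows, folding the tail into one row."""
    total = sum(counts.values())
    ranked = counts.most_common()  # count descending, ties in insertion order
    rows = [(value, n, round(100.0 * n / total, 1)) for value, n in ranked[:top]]
    rest = ranked[top:]
    if rest:
        other = sum(n for _, n in rest)
        label = f"Other values ({len(rest)})"
        rows.append((label, other, round(100.0 * other / total, 1)))
    return rows


# Counts as reported for 시도; the other_* entries are hypothetical placeholders.
sido_counts = Counter({
    "전라북도": 44, "경기도": 27, "충청북도": 13, "경상남도": 11,
    "인천광역시": 9, "경상북도": 8, "전라남도": 8, "충청남도": 6,
    "강원도": 5, "울산광역시": 4,
    "other_1": 3, "other_2": 3, "other_3": 1, "other_4": 1,
})

table = frequency_table(sido_counts)
for value, count, pct in table:
    print(f"{value} | {count} | {pct}%")
```

The same division also gives the Distinct (%) figure: 14 distinct values over 143 rows is about 9.8%.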

Length

[Figure: Histogram of lengths of the category]

시군구
Text

Distinct: 70
Distinct (%): 49.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Length

Max length: 8
Median length: 3
Mean length: 3.5664336
Min length: 2

Characters and Unicode

Total characters: 510
Distinct characters: 76
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 36
Unique (%): 25.2%

Sample

1st row: 강릉시
2nd row: 양구군
3rd row: 영월군
4th row: 원주시
5th row: 정선군

Words

Value | Count | Frequency (%)
전주시 | 9 | 5.4%
장수군 | 6 | 3.6%
무주군 | 6 | 3.6%
덕진구 | 6 | 3.6%
서구 | 5 | 3.0%
화성시 | 5 | 3.0%
정읍시 | 5 | 3.0%
성남시 | 5 | 3.0%
남원시 | 4 | 2.4%
수정구 | 4 | 2.4%
Other values (69) | 111 | 66.9%

Most occurring characters

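Value | Count | Frequency (%)
[not preserved] | 83 | 16.3%
[not preserved] | 45 | 8.8%
[not preserved] | 44 | 8.6%
[not preserved] | 30 | 5.9%
(space) | 23 | 4.5%
[not preserved] | 17 | 3.3%
[not preserved] | 17 | 3.3%
[not preserved] | 17 | 3.3%
[not preserved] | 12 | 2.4%
[not preserved] | 11 | 2.2%
Other values (66) | 211 | 41.4%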

Most occurring categories

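Value | Count | Frequency (%)
Other Letter | 487 | 95.5%
Space Separator | 23 | 4.5%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[not preserved] | 83 | 17.0%
[not preserved] | 45 | 9.2%
[not preserved] | 44 | 9.0%
[not preserved] | 30 | 6.2%
[not preserved] | 17 | 3.5%
[not preserved] | 17 | 3.5%
[not preserved] | 17 | 3.5%
[not preserved] | 12 | 2.5%
[not preserved] | 11 | 2.3%
[not preserved] | 11 | 2.3%
Other values (65) | 200 | 41.1%

Space Separator
Value | Count | Frequency (%)
(space) | 23 | 100.0%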

Most occurring scripts

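Value | Count | Frequency (%)
Hangul | 487 | 95.5%
Common | 23 | 4.5%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[not preserved] | 83 | 17.0%
[not preserved] | 45 | 9.2%
[not preserved] | 44 | 9.0%
[not preserved] | 30 | 6.2%
[not preserved] | 17 | 3.5%
[not preserved] | 17 | 3.5%
[not preserved] | 17 | 3.5%
[not preserved] | 12 | 2.5%
[not preserved] | 11 | 2.3%
[not preserved] | 11 | 2.3%
Other values (65) | 200 | 41.1%

Common
Value | Count | Frequency (%)
(space) | 23 | 100.0%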

Most occurring blocks

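Value | Count | Frequency (%)
Hangul | 487 | 95.5%
ASCII | 23 | 4.5%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[not preserved] | 83 | 17.0%
[not preserved] | 45 | 9.2%
[not preserved] | 44 | 9.0%
[not preserved] | 30 | 6.2%
[not preserved] | 17 | 3.5%
[not preserved] | 17 | 3.5%
[not preserved] | 17 | 3.5%
[not preserved] | 12 | 2.5%
[not preserved] | 11 | 2.3%
[not preserved] | 11 | 2.3%
Other values (65) | 200 | 41.1%

ASCII
Value | Count | Frequency (%)
(space) | 23 | 100.0%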

업체명
Text

UNIQUE 

Distinct: 143
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Length

Max length: 52
Median length: 34
Mean length: 13.20979
Min length: 4

Characters and Unicode

Total characters: 1889
Distinct characters: 300
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
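
As an illustration of how such character properties can be read programmatically, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own code. The two input strings are business names from the Sample rows of this report, and the mapping from two-letter category codes to the long names used in the tables is written out by hand for the nine categories that occur in 업체명.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
}


def category_counts(texts):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in texts:
        for ch in text:
            code = unicodedata.category(ch)
            # Fall back to the raw code for categories not listed above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


samples = ["대성엠디아이(주) 라임켐센터", "에이치디씨리조트 주식회사"]
counts = category_counts(samples)
for name, n in counts.most_common():
    print(f"{name} | {n}")
```

Hangul syllables fall under "Other Letter" because the script has no upper or lower case, which is why that category dominates the tables for this variable.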

Unique

Unique: 143
Unique (%): 100.0%

Sample

1st row: (주)강원건설-강릉시 임영로 161, 임영로179번길 37 석면해체 제거
2nd row: 양구고 기계실습실 증축 및 보수 지정폐기물 철거공사
3rd row: 대성엠디아이(주) 라임켐센터
4th row: 에이치디씨리조트 주식회사
5th row: 정선군쓰레기위생매립장 비산재 위탁처리용역

Words

Value | Count | Frequency (%)
지정폐기물 | 26 | 8.6%
개인 | 12 | 4.0%
주식회사 | 8 | 2.7%
지정폐기물처리 | 7 | 2.3%
폐기물 | 5 | 1.7%
처리 | 4 | 1.3%
폐기물처리 | 4 | 1.3%
[not preserved] | 3 | 1.0%
용역 | 3 | 1.0%
처리공사 | 2 | 0.7%
Other values (223) | 227 | 75.4%

Most occurring characters

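Value | Count | Frequency (%)
(space) | 158 | 8.4%
( | 72 | 3.8%
[not preserved] | 72 | 3.8%
) | 72 | 3.8%
[not preserved] | 64 | 3.4%
[not preserved] | 60 | 3.2%
[not preserved] | 55 | 2.9%
[not preserved] | 51 | 2.7%
[not preserved] | 50 | 2.6%
[not preserved] | 43 | 2.3%
Other values (290) | 1192 | 63.1%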

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1523 | 80.6%
Space Separator | 158 | 8.4%
Open Punctuation | 72 | 3.8%
Close Punctuation | 72 | 3.8%
Decimal Number | 32 | 1.7%
Uppercase Letter | 12 | 0.6%
Dash Punctuation | 11 | 0.6%
Lowercase Letter | 6 | 0.3%
Other Punctuation | 3 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[not preserved] | 72 | 4.7%
[not preserved] | 64 | 4.2%
[not preserved] | 60 | 3.9%
[not preserved] | 55 | 3.6%
[not preserved] | 51 | 3.3%
[not preserved] | 50 | 3.3%
[not preserved] | 43 | 2.8%
[not preserved] | 43 | 2.8%
[not preserved] | 35 | 2.3%
[not preserved] | 28 | 1.8%
Other values (260) | 1022 | 67.1%

Decimal Number
Value | Count | Frequency (%)
3 | 8 | 25.0%
2 | 7 | 21.9%
1 | 6 | 18.8%
6 | 3 | 9.4%
7 | 2 | 6.2%
4 | 2 | 6.2%
9 | 2 | 6.2%
0 | 1 | 3.1%
8 | 1 | 3.1%

Uppercase Letter
Value | Count | Frequency (%)
C | 3 | 25.0%
A | 2 | 16.7%
N | 1 | 8.3%
J | 1 | 8.3%
E | 1 | 8.3%
T | 1 | 8.3%
V | 1 | 8.3%
K | 1 | 8.3%
S | 1 | 8.3%

Lowercase Letter
Value | Count | Frequency (%)
y | 1 | 16.7%
n | 1 | 16.7%
a | 1 | 16.7%
p | 1 | 16.7%
m | 1 | 16.7%
o | 1 | 16.7%

Other Punctuation
Value | Count | Frequency (%)
, | 2 | 66.7%
. | 1 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 158 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 72 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 72 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 11 | 100.0%

Most occurring scripts

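Value | Count | Frequency (%)
Hangul | 1523 | 80.6%
Common | 348 | 18.4%
Latin | 18 | 1.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[not preserved] | 72 | 4.7%
[not preserved] | 64 | 4.2%
[not preserved] | 60 | 3.9%
[not preserved] | 55 | 3.6%
[not preserved] | 51 | 3.3%
[not preserved] | 50 | 3.3%
[not preserved] | 43 | 2.8%
[not preserved] | 43 | 2.8%
[not preserved] | 35 | 2.3%
[not preserved] | 28 | 1.8%
Other values (260) | 1022 | 67.1%

Common
Value | Count | Frequency (%)
(space) | 158 | 45.4%
( | 72 | 20.7%
) | 72 | 20.7%
- | 11 | 3.2%
3 | 8 | 2.3%
2 | 7 | 2.0%
1 | 6 | 1.7%
6 | 3 | 0.9%
7 | 2 | 0.6%
4 | 2 | 0.6%
Other values (5) | 7 | 2.0%

Latin
Value | Count | Frequency (%)
C | 3 | 16.7%
A | 2 | 11.1%
y | 1 | 5.6%
n | 1 | 5.6%
a | 1 | 5.6%
p | 1 | 5.6%
m | 1 | 5.6%
o | 1 | 5.6%
N | 1 | 5.6%
J | 1 | 5.6%
Other values (5) | 5 | 27.8%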

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1523 | 80.6%
ASCII | 366 | 19.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 158 | 43.2%
( | 72 | 19.7%
) | 72 | 19.7%
- | 11 | 3.0%
3 | 8 | 2.2%
2 | 7 | 1.9%
1 | 6 | 1.6%
C | 3 | 0.8%
6 | 3 | 0.8%
7 | 2 | 0.5%
Other values (20) | 24 | 6.6%

Hangul
Value | Count | Frequency (%)
[not preserved] | 72 | 4.7%
[not preserved] | 64 | 4.2%
[not preserved] | 60 | 3.9%
[not preserved] | 55 | 3.6%
[not preserved] | 51 | 3.3%
[not preserved] | 50 | 3.3%
[not preserved] | 43 | 2.8%
[not preserved] | 43 | 2.8%
[not preserved] | 35 | 2.3%
[not preserved] | 28 | 1.8%
Other values (260) | 1022 | 67.1%

연락처
Text

Distinct: 107
Distinct (%): 74.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12
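
Every value in this column is exactly 12 characters long, which suggests a fixed telephone-number layout that can be validated with a pattern match. The sketch below is one possible check, not something the report itself performs. It assumes hyphen-separated Korean numbers in either a 3-3-4 or a 2-4-4 digit grouping, both of which give 12 characters. The pattern and the helper name `is_valid_contact` are assumptions for illustration, and the sample values are the five contact numbers listed under Sample in this report.

```python
import re

# ASCII digits only: area code, exchange, line number, separated by hyphens.
PHONE_PATTERN = re.compile(r"[0-9]{2,3}-[0-9]{3,4}-[0-9]{4}")
EXPECTED_LENGTH = 12  # min, median, mean and max length reported for this column


def is_valid_contact(value):
    """Return True when the value matches the assumed 12-character layout."""
    return len(value) == EXPECTED_LENGTH and PHONE_PATTERN.fullmatch(value) is not None


samples = [
    "033-648-8161",
    "033-480-1441",
    "033-373-9930",
    "033-730-3939",
    "033-560-2337",
]
invalid = [value for value in samples if not is_valid_contact(value)]
print("invalid values:", invalid)
```

A length check alone would not catch a misplaced hyphen, so the pattern and the length are tested together.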

Characters and Unicode

Total characters: 1716
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 104
Unique (%): 72.7%

Sample

1st row: 033-648-8161
2nd row: 033-480-1441
3rd row: 033-373-9930
4th row: 033-730-3939
5th row: 033-560-2337

Words

Value | Count | Frequency (%)
063-275-1007 | 35 | 24.5%
063-620-5628 | 2 | 1.4%
043-530-4524 | 2 | 1.4%
041-585-7711 | 1 | 0.7%
063-471-8256 | 1 | 0.7%
061-688-3648 | 1 | 0.7%
052-704-7079 | 1 | 0.7%
061-273-1008 | 1 | 0.7%
061-270-8885 | 1 | 0.7%
062-373-2347 | 1 | 0.7%
Other values (97) | 97 | 67.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 293 | 17.1%
- | 286 | 16.7%
3 | 199 | 11.6%
5 | 159 | 9.3%
1 | 159 | 9.3%
7 | 153 | 8.9%
2 | 140 | 8.2%
6 | 123 | 7.2%
4 | 86 | 5.0%
8 | 68 | 4.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1430 | 83.3%
Dash Punctuation | 286 | 16.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 293 | 20.5%
3 | 199 | 13.9%
5 | 159 | 11.1%
1 | 159 | 11.1%
7 | 153 | 10.7%
2 | 140 | 9.8%
6 | 123 | 8.6%
4 | 86 | 6.0%
8 | 68 | 4.8%
9 | 50 | 3.5%

Dash Punctuation
Value | Count | Frequency (%)
- | 286 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1716 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 293 | 17.1%
- | 286 | 16.7%
3 | 199 | 11.6%
5 | 159 | 9.3%
1 | 159 | 9.3%
7 | 153 | 8.9%
2 | 140 | 8.2%
6 | 123 | 7.2%
4 | 86 | 5.0%
8 | 68 | 4.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1716 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 293 | 17.1%
- | 286 | 16.7%
3 | 199 | 11.6%
5 | 159 | 9.3%
1 | 159 | 9.3%
7 | 153 | 8.9%
2 | 140 | 8.2%
6 | 123 | 7.2%
4 | 86 | 5.0%
8 | 68 | 4.0%

발생량(톤)
Real number (ℝ)

Distinct: 133
Distinct (%): 93.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 532.63691
Minimum: 0
Maximum: 34446.9
Zeros: 1
Zeros (%): 0.7%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.1145
Q1: 1.56
median: 3.78
Q3: 35.535
95-th percentile: 712.6592
Maximum: 34446.9
Range: 34446.9
Interquartile range (IQR): 33.975
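
The range and interquartile range follow directly from the quantiles listed above: range = maximum - minimum and IQR = Q3 - Q1. The sketch below recomputes both from the reported values. It also derives Tukey's 1.5 × IQR fences, a common outlier heuristic that this report does not itself apply and that is included here only as an assumption-labelled illustration. The function name `tukey_fences` is hypothetical.

```python
# Quantiles as reported for 발생량(톤).
minimum, q1, q3, maximum = 0.0, 1.56, 35.535, 34446.9
p95 = 712.6592


def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences at k times the IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


value_range = maximum - minimum
iqr = q3 - q1
lower_fence, upper_fence = tukey_fences(q1, q3)

print(f"range = {value_range:.1f}")
print(f"IQR = {iqr:.3f}")
print(f"fences = ({lower_fence:.4f}, {upper_fence:.4f})")
# The 95th percentile already exceeds the upper fence, so this heuristic
# would flag more than 5% of the observations.
print("95th percentile above upper fence:", p95 > upper_fence)
```

With a distribution this skewed, the fences describe the bulk of the data rather than a meaningful cut-off for errors.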

Descriptive statistics

Standard deviation: 3483.9805
Coefficient of variation (CV): 6.5410046
Kurtosis: 75.774015
Mean: 532.63691
Median Absolute Deviation (MAD): 3.38
Skewness: 8.5402852
Sum: 76167.078
Variance: 12138120
Monotonicity: Not monotonic
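
Several of these statistics are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The following sketch checks that the reported figures agree with each other. The inputs are the rounded values printed in this report, so the comparisons use small tolerances rather than exact equality.

```python
n_observations = 143
mean = 532.63691
std = 3483.9805

# Values as printed in the report, used as the reference for each identity.
reported_cv = 6.5410046
reported_variance = 12138120
reported_sum = 76167.078

cv = std / mean              # coefficient of variation
variance = std ** 2          # variance is the squared standard deviation
total = mean * n_observations

print(f"CV = {cv:.7f} (reported {reported_cv})")
print(f"variance = {variance:.0f} (reported {reported_variance})")
print(f"sum = {total:.3f} (reported {reported_sum})")
```

A CV above 6 means the standard deviation is more than six times the mean, consistent with the very high skewness and kurtosis reported for this variable.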
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1.4 | 3 | 2.1%
0.6 | 3 | 2.1%
0.4 | 2 | 1.4%
0.05 | 2 | 1.4%
1.59 | 2 | 1.4%
0.8 | 2 | 1.4%
1.56 | 2 | 1.4%
2.0 | 2 | 1.4%
0.69 | 1 | 0.7%
5.66 | 1 | 0.7%
Other values (123) | 123 | 86.0%

Minimum 10 values

Value | Count | Frequency (%)
0.0 | 1 | 0.7%
0.00137 | 1 | 0.7%
0.02 | 1 | 0.7%
0.03 | 1 | 0.7%
0.033883 | 1 | 0.7%
0.05 | 2 | 1.4%
0.11 | 1 | 0.7%
0.155 | 1 | 0.7%
0.2 | 1 | 0.7%
0.24 | 1 | 0.7%

Maximum 10 values

Value | Count | Frequency (%)
34446.9 | 1 | 0.7%
23060.0 | 1 | 0.7%
5666.93 | 1 | 0.7%
2511.65 | 1 | 0.7%
1786.45 | 1 | 0.7%
1033.33 | 1 | 0.7%
743.85 | 1 | 0.7%
713.6 | 1 | 0.7%
704.192 | 1 | 0.7%
594.57 | 1 | 0.7%

Interactions


Correlations

Variable | 시도 | 시군구 | 발생량(톤)
시도 | 1.000 | 0.994 | 0.000
시군구 | 0.994 | 1.000 | 0.845
발생량(톤) | 0.000 | 0.845 | 1.000

Variable | 발생량(톤) | 시도
발생량(톤) | 1.000 | 0.000
시도 | 0.000 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

Row | 시도 | 시군구 | 업체명 | 연락처 | 발생량(톤)
0 | 강원도 | 강릉시 | (주)강원건설-강릉시 임영로 161, 임영로179번길 37 석면해체 제거 | 033-648-8161 | 1.98
1 | 강원도 | 양구군 | 양구고 기계실습실 증축 및 보수 지정폐기물 철거공사 | 033-480-1441 | 0.332
2 | 강원도 | 영월군 | 대성엠디아이(주) 라임켐센터 | 033-373-9930 | 0.62
3 | 강원도 | 원주시 | 에이치디씨리조트 주식회사 | 033-730-3939 | 42.333
4 | 강원도 | 정선군 | 정선군쓰레기위생매립장 비산재 위탁처리용역 | 033-560-2337 | 24.23
5 | 경기도 | 가평군 | 가평군분뇨처리시설 | 031-580-4442 | 1.75
6 | 경기도 | 고양시 일산동구 | 고양도시관리공사고양시환경에너지시설 | 031-909-4813 | 2511.65
7 | 경기도 | 군포시 | (주)군포종합폐차장 | 031-451-9040 | 5.08
8 | 경기도 | 군포시 | (주)동원테크 | 031-427-1056 | 0.8
9 | 경기도 | 김포시 | 구래동천 정비사업 | 031-980-2922 | 0.00137

Last rows

Row | 시도 | 시군구 | 업체명 | 연락처 | 발생량(톤)
133 | 충청북도 | 음성군 | 주식회사 알트민. | 043-873-4991 | 1.56
134 | 충청북도 | 음성군 | 충청내륙고속화(제2공구) 도로건설공사 석면 폐기물 | 042-670-3572 | 12.12
135 | 충청북도 | 진천군 | 서한산업(주)3공장 | 043-530-4524 | 147.643
136 | 충청북도 | 진천군 | 서한산업(주)제2공장 | 043-530-4524 | 575.5
137 | 충청북도 | 진천군 | 진천산림항공관리소(진천) | 043-530-7803 | 1.8
138 | 충청북도 | 진천군 | 케이제이코 | 043-533-9970 | 10.22
139 | 충청북도 | 청주시 청원구 | 청주청원경찰서석면교체공사 | 043-215-1323 | 23060.0
140 | 충청북도 | 청주시 흥덕구 | (주)송정 | 043-232-7627 | 8.63
141 | 충청북도 | 충주시 | (주)대원포리머 | 043-840-2600 | 208.278
142 | 충청북도 | 충주시 | 글로텍 주식회사 | 043-720-7733 | 107.551