Overview

Dataset statistics

Number of variables: 6
Number of observations: 96
Missing cells: 14
Missing cells (%): 2.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.7 KiB
Average record size in memory: 50.4 B
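
The missing-cell percentage can be checked by hand: it is consistent with dividing the missing cells by the total cell count (variables times observations). A minimal Python sketch using only the figures listed above; the variable names are illustrative.

```python
# Figures copied from the Dataset statistics block.
n_variables = 6
n_observations = 96
missing_cells = 14

# Total number of cells in the table.
total_cells = n_variables * n_observations

# Share of cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")  # 576 cells, 2.4% missing
```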

Variable types

Numeric: 1
Categorical: 2
Text: 3

Dataset

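Description: Status of waste management businesses in Sasang-gu, Busan Metropolitan City, as of 2023-03-28 (부산광역시_사상구_폐기물업체현황_20230328)
Author: Sasang-gu, Busan Metropolitan City (부산광역시 사상구)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15025670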

Alerts

비고 is highly overall correlated with 연번 and 1 other field (High correlation)
업종 is highly overall correlated with 연번 and 1 other field (High correlation)
연번 is highly overall correlated with 업종 and 1 other field (High correlation)
비고 is highly imbalanced (75.0%) (Imbalance)
연락처 has 14 (14.6%) missing values (Missing)
연번 has unique values (Unique)
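
The 75.0% imbalance figure for 비고 is consistent with an entropy-based score: one minus the Shannon entropy of the class distribution divided by its maximum, log2 of the number of classes. The sketch below assumes that definition (this report does not state it) and uses the counts 92 and 4 from the 비고 section; `imbalance_score` is an illustrative helper, not a library function.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/H_max, where H is the base-2 Shannon entropy of the counts."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 0.0  # a single class is treated as score 0 in this sketch
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(probs))

# 92 rows are "<NA>" and 4 rows are "임시보관장소".
score = imbalance_score([92, 4])
print(f"{score:.1%}")  # 75.0%
```

A perfectly balanced column scores 0 under this definition, and the score approaches 1 as one class dominates.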

Reproduction

Analysis started: 2023-12-10 16:28:51.536528
Analysis finished: 2023-12-10 16:28:52.344478
Duration: 0.81 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
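
A report of this kind can be regenerated with ydata-profiling's `ProfileReport`. The sketch below is an outline under stated assumptions, not the exact command used for this report: the file names, the `cp949` encoding, and the report title are hypothetical placeholders, and the imports are deferred into the function so the packages are only needed when it is called.

```python
def build_report(csv_path, html_path="report.html", encoding="cp949"):
    """Profile a CSV file and write the HTML report to html_path."""
    # Deferred imports: pandas and ydata-profiling are only required at call time.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)

    # The title is a placeholder, not taken from the original configuration.
    profile = ProfileReport(df, title="Sasang-gu waste business profile")
    profile.to_file(html_path)
    return html_path
```

For example, `build_report("waste_businesses.csv")` would write `report.html` next to the script, where the CSV name is again hypothetical.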

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 96
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 48.5
Minimum: 1
Maximum: 96
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 996.0 B

Quantile statistics

Minimum: 1
5-th percentile: 5.75
Q1: 24.75
Median: 48.5
Q3: 72.25
95-th percentile: 91.25
Maximum: 96
Range: 95
Interquartile range (IQR): 47.5

Descriptive statistics

Standard deviation: 27.856777
Coefficient of variation (CV): 0.57436653
Kurtosis: -1.2
Mean: 48.5
Median Absolute Deviation (MAD): 24
Skewness: 0
Sum: 4656
Variance: 776
Monotonicity: Strictly increasing
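
Because 연번 is unique, strictly increasing, and runs from 1 to 96 over 96 rows with a sum of 4656, it is presumably the integer sequence 1..96. Under that assumption, most of the figures above can be reproduced with the Python standard library. The sketch uses the sample variance (divisor n - 1) and linear-interpolation quantiles, which agree with the reported values.

```python
import statistics

# Assumption: 연번 is the integer sequence 1..96.
values = list(range(1, 97))

mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, divisor n - 1
std = statistics.stdev(values)
cv = std / mean                          # coefficient of variation

# Quartiles with linear interpolation between the closest ranks.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(sum(values), mean, variance, round(std, 6), round(cv, 8))
print(q1, median, q3, q3 - q1, mad)
```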
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 1 | 1.0%
50 | 1 | 1.0%
72 | 1 | 1.0%
71 | 1 | 1.0%
70 | 1 | 1.0%
69 | 1 | 1.0%
68 | 1 | 1.0%
67 | 1 | 1.0%
66 | 1 | 1.0%
65 | 1 | 1.0%
Other values (86) | 86 | 89.6%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 1.0%
2 | 1 | 1.0%
3 | 1 | 1.0%
4 | 1 | 1.0%
5 | 1 | 1.0%
6 | 1 | 1.0%
7 | 1 | 1.0%
8 | 1 | 1.0%
9 | 1 | 1.0%
10 | 1 | 1.0%

Maximum 10 values
Value | Count | Frequency (%)
96 | 1 | 1.0%
95 | 1 | 1.0%
94 | 1 | 1.0%
93 | 1 | 1.0%
92 | 1 | 1.0%
91 | 1 | 1.0%
90 | 1 | 1.0%
89 | 1 | 1.0%
88 | 1 | 1.0%
87 | 1 | 1.0%

업종 (business type)
Categorical

HIGH CORRELATION

Distinct: 11
Distinct (%): 11.5%
Missing: 0
Missing (%): 0.0%
Memory size: 900.0 B

사업장폐기물수집운반업(배출시설계): 39
건설폐기물수집운반업: 15
사업장폐기물수집운반업(비배출시설계): 8
폐기물중간재활용업: 7
폐기물종합재활용업: 7
Other values (6): 20

Length

Max length: 19
Median length: 18
Mean length: 14.552083
Min length: 9

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 건설폐기물수집운반업
2nd row: 건설폐기물수집운반업
3rd row: 건설폐기물수집운반업
4th row: 건설폐기물수집운반업
5th row: 건설폐기물수집운반업

Common Values

Value | Count | Frequency (%)
사업장폐기물수집운반업(배출시설계) | 39 | 40.6%
건설폐기물수집운반업 | 15 | 15.6%
사업장폐기물수집운반업(비배출시설계) | 8 | 8.3%
폐기물중간재활용업 | 7 | 7.3%
폐기물종합재활용업 | 7 | 7.3%
폐기물처리시설(압축시설) | 7 | 7.3%
건설폐기물중간처리업 | 4 | 4.2%
사업장폐기물수집운반업(생활폐기물) | 3 | 3.1%
폐기물처리신고(수집운반) | 3 | 3.1%
폐기물처리신고(재활용) | 2 | 2.1%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
사업장폐기물수집운반업(배출시설계 | 39 | 40.2%
건설폐기물수집운반업 | 15 | 15.5%
사업장폐기물수집운반업(비배출시설계 | 8 | 8.2%
폐기물중간재활용업 | 7 | 7.2%
폐기물종합재활용업 | 7 | 7.2%
폐기물처리시설(압축시설 | 7 | 7.2%
건설폐기물중간처리업 | 4 | 4.1%
사업장폐기물수집운반업(생활폐기물 | 3 | 3.1%
폐기물처리신고(수집운반 | 3 | 3.1%
폐기물처리신고(재활용 | 2 | 2.1%
Other values (2) | 2 | 2.1%

사업장명 (business name)
Text

Distinct: 80
Distinct (%): 83.3%
Missing: 0
Missing (%): 0.0%
Memory size: 900.0 B

Length

Max length: 13
Median length: 12
Mean length: 6.5625
Min length: 2

Characters and Unicode

Total characters: 630
Distinct characters: 109
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 65
Unique (%): 67.7%

Sample

1st row: (주)동아에너지
2nd row: (주)풍영
3rd row: 부산리사이클링(주)
4th row: 호제환경산업(주)
5th row: 금강산개발(주)

Value | Count | Frequency (%)
주)호생환경 | 3 | 3.1%
주)삼정환경산업 | 2 | 2.1%
대성기업(주 | 2 | 2.1%
대우이앤씨 | 2 | 2.1%
주)대흥리사이클링 | 2 | 2.1%
우호기업 | 2 | 2.1%
주)대흥리사이클링-폐지 | 2 | 2.1%
청신산업(주 | 2 | 2.1%
주)이놉스 | 2 | 2.1%
아진산업(주 | 2 | 2.1%
Other values (71) | 76 | 78.4%

Most occurring characters

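Value | Count | Frequency (%)
(glyph missing) | 55 | 8.7%
( | 52 | 8.3%
) | 52 | 8.3%
(glyph missing) | 33 | 5.2%
(glyph missing) | 31 | 4.9%
(glyph missing) | 30 | 4.8%
(glyph missing) | 29 | 4.6%
(glyph missing) | 15 | 2.4%
(glyph missing) | 15 | 2.4%
(glyph missing) | 12 | 1.9%
Other values (99) | 306 | 48.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 517 | 82.1%
Open Punctuation | 52 | 8.3%
Close Punctuation | 52 | 8.3%
Dash Punctuation | 6 | 1.0%
Other Symbol | 2 | 0.3%
Space Separator | 1 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(glyph missing) | 55 | 10.6%
(glyph missing) | 33 | 6.4%
(glyph missing) | 31 | 6.0%
(glyph missing) | 30 | 5.8%
(glyph missing) | 29 | 5.6%
(glyph missing) | 15 | 2.9%
(glyph missing) | 15 | 2.9%
(glyph missing) | 12 | 2.3%
(glyph missing) | 11 | 2.1%
(glyph missing) | 10 | 1.9%
Other values (94) | 276 | 53.4%

Open Punctuation
Value | Count | Frequency (%)
( | 52 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 52 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 6 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(glyph missing) | 2 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 519 | 82.4%
Common | 111 | 17.6%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(glyph missing) | 55 | 10.6%
(glyph missing) | 33 | 6.4%
(glyph missing) | 31 | 6.0%
(glyph missing) | 30 | 5.8%
(glyph missing) | 29 | 5.6%
(glyph missing) | 15 | 2.9%
(glyph missing) | 15 | 2.9%
(glyph missing) | 12 | 2.3%
(glyph missing) | 11 | 2.1%
(glyph missing) | 10 | 1.9%
Other values (95) | 278 | 53.6%

Common
Value | Count | Frequency (%)
( | 52 | 46.8%
) | 52 | 46.8%
- | 6 | 5.4%
(space) | 1 | 0.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 517 | 82.1%
ASCII | 111 | 17.6%
None | 2 | 0.3%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(glyph missing) | 55 | 10.6%
(glyph missing) | 33 | 6.4%
(glyph missing) | 31 | 6.0%
(glyph missing) | 30 | 5.8%
(glyph missing) | 29 | 5.6%
(glyph missing) | 15 | 2.9%
(glyph missing) | 15 | 2.9%
(glyph missing) | 12 | 2.3%
(glyph missing) | 11 | 2.1%
(glyph missing) | 10 | 1.9%
Other values (94) | 276 | 53.4%

ASCII
Value | Count | Frequency (%)
( | 52 | 46.8%
) | 52 | 46.8%
- | 6 | 5.4%
(space) | 1 | 0.9%

None
Value | Count | Frequency (%)
(glyph missing) | 2 | 100.0%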

소재지 (address)
Text

Distinct: 75
Distinct (%): 78.1%
Missing: 0
Missing (%): 0.0%
Memory size: 900.0 B

Length

Max length: 43
Median length: 33
Mean length: 19.34375
Min length: 10

Characters and Unicode

Total characters: 1857
Distinct characters: 112
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 58
Unique (%): 60.4%

Sample

1st row: 낙동대로1062번길 17(감전동)
2nd row: 장인로37번길 116(감전동)
3rd row: 낙동대로901번길 25-18(감전동)
4th row: 낙동대로 910, E동 315호(마트월드, 감전동)
5th row: 낙동대로1310번길 63(삼락동)

Value | Count | Frequency (%)
낙동대로 | 15 | 5.5%
새벽로 | 7 | 2.6%
광장로 | 6 | 2.2%
부산산업용재유통상가 | 5 | 1.8%
131 | 5 | 1.8%
101-1(괘법동 | 4 | 1.5%
모덕로 | 4 | 1.5%
농산물시장로 | 3 | 1.1%
감전천로 | 3 | 1.1%
학장로 | 3 | 1.1%
Other values (169) | 216 | 79.7%

Most occurring characters

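Value | Count | Frequency (%)
(space) | 175 | 9.4%
(glyph missing) | 141 | 7.6%
1 | 106 | 5.7%
) | 97 | 5.2%
( | 97 | 5.2%
(glyph missing) | 96 | 5.2%
2 | 70 | 3.8%
0 | 64 | 3.4%
3 | 58 | 3.1%
(glyph missing) | 53 | 2.9%
Other values (102) | 900 | 48.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 922 | 49.6%
Decimal Number | 497 | 26.8%
Space Separator | 175 | 9.4%
Close Punctuation | 97 | 5.2%
Open Punctuation | 97 | 5.2%
Other Punctuation | 41 | 2.2%
Dash Punctuation | 18 | 1.0%
Lowercase Letter | 5 | 0.3%
Uppercase Letter | 5 | 0.3%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(glyph missing) | 141 | 15.3%
(glyph missing) | 96 | 10.4%
(glyph missing) | 53 | 5.7%
(glyph missing) | 51 | 5.5%
(glyph missing) | 36 | 3.9%
(glyph missing) | 33 | 3.6%
(glyph missing) | 33 | 3.6%
(glyph missing) | 31 | 3.4%
(glyph missing) | 25 | 2.7%
(glyph missing) | 21 | 2.3%
Other values (79) | 402 | 43.6%

Decimal Number
Value | Count | Frequency (%)
1 | 106 | 21.3%
2 | 70 | 14.1%
0 | 64 | 12.9%
3 | 58 | 11.7%
6 | 44 | 8.9%
4 | 40 | 8.0%
7 | 38 | 7.6%
5 | 38 | 7.6%
9 | 21 | 4.2%
8 | 18 | 3.6%

Lowercase Letter
Value | Count | Frequency (%)
e | 2 | 40.0%
n | 1 | 20.0%
t | 1 | 20.0%
r | 1 | 20.0%

Uppercase Letter
Value | Count | Frequency (%)
A | 2 | 40.0%
C | 1 | 20.0%
E | 1 | 20.0%
D | 1 | 20.0%

Space Separator
Value | Count | Frequency (%)
(space) | 175 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 97 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 97 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 41 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 18 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 925 | 49.8%
Hangul | 922 | 49.6%
Latin | 10 | 0.5%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(glyph missing) | 141 | 15.3%
(glyph missing) | 96 | 10.4%
(glyph missing) | 53 | 5.7%
(glyph missing) | 51 | 5.5%
(glyph missing) | 36 | 3.9%
(glyph missing) | 33 | 3.6%
(glyph missing) | 33 | 3.6%
(glyph missing) | 31 | 3.4%
(glyph missing) | 25 | 2.7%
(glyph missing) | 21 | 2.3%
Other values (79) | 402 | 43.6%

Common
Value | Count | Frequency (%)
(space) | 175 | 18.9%
1 | 106 | 11.5%
) | 97 | 10.5%
( | 97 | 10.5%
2 | 70 | 7.6%
0 | 64 | 6.9%
3 | 58 | 6.3%
6 | 44 | 4.8%
, | 41 | 4.4%
4 | 40 | 4.3%
Other values (5) | 133 | 14.4%

Latin
Value | Count | Frequency (%)
e | 2 | 20.0%
A | 2 | 20.0%
C | 1 | 10.0%
E | 1 | 10.0%
D | 1 | 10.0%
n | 1 | 10.0%
t | 1 | 10.0%
r | 1 | 10.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 935 | 50.4%
Hangul | 922 | 49.6%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 175 | 18.7%
1 | 106 | 11.3%
) | 97 | 10.4%
( | 97 | 10.4%
2 | 70 | 7.5%
0 | 64 | 6.8%
3 | 58 | 6.2%
6 | 44 | 4.7%
, | 41 | 4.4%
4 | 40 | 4.3%
Other values (13) | 143 | 15.3%

Hangul
Value | Count | Frequency (%)
(glyph missing) | 141 | 15.3%
(glyph missing) | 96 | 10.4%
(glyph missing) | 53 | 5.7%
(glyph missing) | 51 | 5.5%
(glyph missing) | 36 | 3.9%
(glyph missing) | 33 | 3.6%
(glyph missing) | 33 | 3.6%
(glyph missing) | 31 | 3.4%
(glyph missing) | 25 | 2.7%
(glyph missing) | 21 | 2.3%
Other values (79) | 402 | 43.6%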

연락처 (contact number)
Text

MISSING

Distinct: 62
Distinct (%): 75.6%
Missing: 14
Missing (%): 14.6%
Memory size: 900.0 B

Length

Max length: 13
Median length: 12
Mean length: 12.060976
Min length: 12

Characters and Unicode

Total characters: 989
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 46
Unique (%): 56.1%
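
The percentages for 연락처 appear to use two different denominators: Missing (%) is relative to all 96 rows, while Distinct (%) and Unique (%) are consistent with division by the 82 non-missing values. The short check below reproduces the reported figures under that reading, and also recovers the mean length from the total character count.

```python
n_rows = 96
n_missing = 14
n_present = n_rows - n_missing   # 82 non-missing phone numbers

n_distinct = 62
n_unique = 46
total_chars = 989

missing_pct = 100 * n_missing / n_rows        # share of all rows
distinct_pct = 100 * n_distinct / n_present   # share of non-missing values
unique_pct = 100 * n_unique / n_present
mean_length = total_chars / n_present

print(f"{missing_pct:.1f}% {distinct_pct:.1f}% {unique_pct:.1f}% {mean_length:.6f}")
```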

Sample

1st row: 051-317-0371
2nd row: 051-324-7288
3rd row: 051-327-2727
4th row: 051-710-1440
5th row: 051-301-1655

Value | Count | Frequency (%)
051-315-5608 | 4 | 4.9%
051-312-7991 | 3 | 3.7%
051-327-1332 | 3 | 3.7%
051-313-0382 | 2 | 2.4%
051-317-0371 | 2 | 2.4%
051-303-5234 | 2 | 2.4%
051-301-8201 | 2 | 2.4%
070-4202-7057 | 2 | 2.4%
051-746-0840 | 2 | 2.4%
051-303-8260 | 2 | 2.4%
Other values (52) | 58 | 70.7%

Most occurring characters

Value | Count | Frequency (%)
- | 164 | 16.6%
1 | 158 | 16.0%
0 | 154 | 15.6%
3 | 130 | 13.1%
5 | 120 | 12.1%
2 | 69 | 7.0%
7 | 60 | 6.1%
4 | 46 | 4.7%
8 | 31 | 3.1%
6 | 30 | 3.0%
Other values (2) | 27 | 2.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 824 | 83.3%
Dash Punctuation | 164 | 16.6%
Space Separator | 1 | 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 158 | 19.2%
0 | 154 | 18.7%
3 | 130 | 15.8%
5 | 120 | 14.6%
2 | 69 | 8.4%
7 | 60 | 7.3%
4 | 46 | 5.6%
8 | 31 | 3.8%
6 | 30 | 3.6%
9 | 26 | 3.2%

Dash Punctuation
Value | Count | Frequency (%)
- | 164 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 989 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 164 | 16.6%
1 | 158 | 16.0%
0 | 154 | 15.6%
3 | 130 | 13.1%
5 | 120 | 12.1%
2 | 69 | 7.0%
7 | 60 | 6.1%
4 | 46 | 4.7%
8 | 31 | 3.1%
6 | 30 | 3.0%
Other values (2) | 27 | 2.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 989 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 164 | 16.6%
1 | 158 | 16.0%
0 | 154 | 15.6%
3 | 130 | 13.1%
5 | 120 | 12.1%
2 | 69 | 7.0%
7 | 60 | 6.1%
4 | 46 | 4.7%
8 | 31 | 3.1%
6 | 30 | 3.0%
Other values (2) | 27 | 2.7%

비고 (remarks)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Memory size: 900.0 B

<NA>: 92
임시보관장소: 4

Length

Max length: 6
Median length: 4
Mean length: 4.0833333
Min length: 4
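
These length statistics suggest why 비고 reports 0 missing values even though 92 of its 96 entries are `<NA>`: the placeholder appears to have been treated as an ordinary 4-character string. Assuming that, the mean length follows directly from the two category counts listed above.

```python
# Category counts from the 비고 section; "<NA>" is assumed to be a literal string.
counts = {"<NA>": 92, "임시보관장소": 4}

total = sum(counts.values())

# Weighted mean of string lengths: (92 * 4 + 4 * 6) / 96.
mean_length = sum(len(value) * n for value, n in counts.items()) / total

print(total, round(mean_length, 7))  # 96 4.0833333
```

If the empty remarks were read as true missing values instead, 비고 would presumably show 92 missing cells and a single distinct category.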

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: 임시보관장소
3rd row: <NA>
4th row: <NA>
5th row: 임시보관장소

Common Values

Value | Count | Frequency (%)
<NA> | 92 | 95.8%
임시보관장소 | 4 | 4.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
na | 92 | 95.8%
임시보관장소 | 4 | 4.2%

Interactions


Correlations

Variable | 연번 | 업종 | 사업장명 | 소재지 | 연락처
연번 | 1.000 | 0.869 | 0.000 | 0.644 | 0.000
업종 | 0.869 | 1.000 | 0.000 | 0.000 | 0.000
사업장명 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000
소재지 | 0.644 | 0.000 | 1.000 | 1.000 | 0.999
연락처 | 0.000 | 0.000 | 1.000 | 0.999 | 1.000

Variable | 비고 | 업종
비고 | 1.000 | 1.000
업종 | 1.000 | 1.000

Variable | 연번 | 업종 | 비고
연번 | 1.000 | 0.605 | 1.000
업종 | 0.605 | 1.000 | 1.000
비고 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First rows

Index | 연번 | 업종 | 사업장명 | 소재지 | 연락처 | 비고
0 | 1 | 건설폐기물수집운반업 | (주)동아에너지 | 낙동대로1062번길 17(감전동) | 051-317-0371 | <NA>
1 | 2 | 건설폐기물수집운반업 | (주)풍영 | 장인로37번길 116(감전동) | 051-324-7288 | 임시보관장소
2 | 3 | 건설폐기물수집운반업 | 부산리사이클링(주) | 낙동대로901번길 25-18(감전동) | 051-327-2727 | <NA>
3 | 4 | 건설폐기물수집운반업 | 호제환경산업(주) | 낙동대로 910, E동 315호(마트월드, 감전동) | 051-710-1440 | <NA>
4 | 5 | 건설폐기물수집운반업 | 금강산개발(주) | 낙동대로1310번길 63(삼락동) | 051-301-1655 | 임시보관장소
5 | 6 | 건설폐기물수집운반업 | (주)호생환경 | 낙동대로 665(엄궁동) | 051-327-1332 | <NA>
6 | 7 | 건설폐기물수집운반업 | 건설환경(주) | 농산물시장로 40-67(엄궁동) | 051-316-4200 | <NA>
7 | 8 | 건설폐기물수집운반업 | 에코포스트㈜ | 낙동대로 671(엄궁동) | 051-317-3751 | <NA>
8 | 9 | 건설폐기물수집운반업 | (주)삼정환경산업 | 하신번영로 462(엄궁동) | 051-313-0382 | <NA>
9 | 10 | 건설폐기물수집운반업 | (주)우리환경자원 | 학장로 269(학장동) | 051-315-4404 | 임시보관장소

Last rows

Index | 연번 | 업종 | 사업장명 | 소재지 | 연락처 | 비고
86 | 87 | 폐기물처리시설(압축시설) | 그린리사이클링-폐지 | 새벽로45번길 70-14(학장동) | 070-7123-3337 | <NA>
87 | 88 | 폐기물처리시설(압축시설) | (주)대흥리사이클링-폐지 | 낙동대로 926(감전동) | 051-315-5608 | <NA>
88 | 89 | 폐기물처리시설(압축시설) | 한성산업(주) | 낙동대로915번길 12(감전동) | 051-324-7001 | <NA>
89 | 90 | 폐기물처리시설(압축시설) | 영조고철-고철 | 사상로385번길 42 (덕포동) | <NA> | <NA>
90 | 91 | 폐기물처리시설(순수발효방식 감량기) | 농협경제지주 부산공판장 | 농산물시장로 25번길 75(엄궁동) | 051-314-4989 | <NA>
91 | 92 | 폐기물처리신고(수집운반) | 모란자원 | 대동로239번길 13(학장동) | 051-316-7933 | <NA>
92 | 93 | 폐기물처리신고(수집운반) | (주)대흥리사이클링 | 낙동대로 926(감전동) | 051-315-5608 | <NA>
93 | 94 | 폐기물처리신고(수집운반) | 그린리사이클링 | 새벽로45번길 70-14(학장동) | 070-7123-3337 | <NA>
94 | 95 | 폐기물처리신고(재활용) | (주)동일케미칼 | 장인로77번길 103(학장동) | 051-313-0657 | <NA>
95 | 96 | 폐기물처리신고(재활용) | 옥미산업 | 모덕로 16(삼락동) | 051-304-0335 | <NA>