Overview

Dataset statistics

Number of variables: 5
Number of observations: 41
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.8 KiB
Average record size in memory: 44.2 B

Variable types

Numeric: 1
Categorical: 2
Text: 2

Dataset

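Description: 부산광역시동래구_사업장폐기물배출자현황_20230713 (Busan Metropolitan City, Dongnae-gu: status of business-site waste generators, 2023-07-13)
Author: 부산광역시 동래구 (Dongnae-gu, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15060260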

Alerts

연번 is highly overall correlated with 구분 (High correlation)
구분 is highly overall correlated with 연번 and 1 other field (High correlation)
폐기물종류 is highly overall correlated with 구분 (High correlation)
구분 is highly imbalanced (71.9%) (Imbalance)
연번 has unique values (Unique)
업체명 has unique values (Unique)
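
The 71.9% imbalance figure for 구분 can be checked against the category counts this report gives for that variable (39 and 2). The sketch below assumes the score is one minus the Shannon entropy of the class proportions, normalized by the logarithm of the number of classes. That definition is an assumption rather than something the report states, although it does reproduce the reported value. The name `imbalance_score` is illustrative.

```python
from math import log2


def imbalance_score(counts):
    """Return 1 - normalized Shannon entropy of the class counts.

    Assumed definition: 0 means perfectly balanced classes, values near 1
    mean one class dominates. Fewer than two observed classes gives 0 here
    by convention.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 0.0
    entropy = -sum(p * log2(p) for p in probs)
    return 1 - entropy / log2(len(probs))


# Counts for 구분 taken from this report: 39 and 2.
score = imbalance_score([39, 2])
print(f"{score:.1%}")  # 71.9%
```

A perfectly balanced two-class column scores 0 under this definition, so 71.9% indicates that almost all rows fall into a single class.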

Reproduction

Analysis started: 2023-12-10 16:26:01.851879
Analysis finished: 2023-12-10 16:26:03.073213
Duration: 1.22 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 41
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21
Minimum: 1
Maximum: 41
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 501.0 B

Quantile statistics

Minimum: 1
5th percentile: 3
Q1: 11
Median: 21
Q3: 31
95th percentile: 39
Maximum: 41
Range: 40
Interquartile range (IQR): 20
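
Because 연번 is reported as 41 distinct, strictly increasing values from 1 to 41, the quantile statistics can be rechecked by hand. The sketch below assumes the column is exactly the integers 1 through 41 and that percentiles use linear interpolation between order statistics (the `inclusive` method of Python's `statistics.quantiles`). Both are assumptions consistent with the figures above, not details confirmed by the report.

```python
from statistics import quantiles

# Assumed reconstruction of the column: the integers 1..41.
values = list(range(1, 42))

# 19 cut points at 5%, 10%, ..., 95% with linear interpolation.
cuts = quantiles(values, n=20, method="inclusive")
p5, q1, med, q3, p95 = cuts[0], cuts[4], cuts[9], cuts[14], cuts[18]

value_range = max(values) - min(values)
iqr = q3 - q1

print(p5, q1, med, q3, p95)  # 3.0 11.0 21.0 31.0 39.0
print(value_range, iqr)      # 40 20.0
```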

Descriptive statistics

Standard deviation: 11.979149
Coefficient of variation (CV): 0.57043565
Kurtosis: -1.2
Mean: 21
Median Absolute Deviation (MAD): 10
Skewness: 0
Sum: 861
Variance: 143.5
Monotonicity: Strictly increasing
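
The descriptive statistics can be rechecked the same way. This sketch again assumes the column is the integers 1 through 41. It further assumes the variance uses the n - 1 denominator and that kurtosis is the bias-adjusted excess kurtosis, which is what reproduces 143.5 and -1.2; the report itself does not state which estimators were used.

```python
from math import sqrt
from statistics import mean, median, variance

# Assumed reconstruction of the column: the integers 1..41.
values = list(range(1, 42))
n = len(values)

total = sum(values)
avg = mean(values)
var = variance(values)   # sample variance, n - 1 denominator
std = sqrt(var)
cv = std / avg           # coefficient of variation

# Median absolute deviation around the median.
med = median(values)
mad = median(abs(x - med) for x in values)

# Central moments with the population (n) denominator.
m2 = sum((x - avg) ** 2 for x in values) / n
m3 = sum((x - avg) ** 3 for x in values) / n
m4 = sum((x - avg) ** 4 for x in values) / n

skewness = m3 / m2 ** 1.5
excess = m4 / m2 ** 2 - 3
# Bias-adjusted excess kurtosis (assumed estimator).
kurtosis = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * excess + 6)

print(total, avg, var, mad)
print(round(std, 6), round(cv, 8), round(kurtosis, 6), skewness)
```

For any run of consecutive integers the adjusted excess kurtosis works out to exactly -1.2 and the skewness to 0, so these two figures say little beyond confirming that 연번 is a plain row counter.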
Histogram with fixed size bins (bins=41)
Common values
Value | Count | Frequency (%)
1 | 1 | 2.4%
32 | 1 | 2.4%
24 | 1 | 2.4%
25 | 1 | 2.4%
26 | 1 | 2.4%
27 | 1 | 2.4%
28 | 1 | 2.4%
29 | 1 | 2.4%
30 | 1 | 2.4%
31 | 1 | 2.4%
Other values (31) | 31 | 75.6%
Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 2.4%
2 | 1 | 2.4%
3 | 1 | 2.4%
4 | 1 | 2.4%
5 | 1 | 2.4%
6 | 1 | 2.4%
7 | 1 | 2.4%
8 | 1 | 2.4%
9 | 1 | 2.4%
10 | 1 | 2.4%
Maximum 10 values
Value | Count | Frequency (%)
41 | 1 | 2.4%
40 | 1 | 2.4%
39 | 1 | 2.4%
38 | 1 | 2.4%
37 | 1 | 2.4%
36 | 1 | 2.4%
35 | 1 | 2.4%
34 | 1 | 2.4%
33 | 1 | 2.4%
32 | 1 | 2.4%

구분 (classification)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 4.9%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B
Category counts:
비배출시설계 (non-emission-facility category): 39
배출시설계 (emission-facility category): 2

Length

Max length: 6
Median length: 6
Mean length: 5.9512195
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 비배출시설계
2nd row: 비배출시설계
3rd row: 비배출시설계
4th row: 비배출시설계
5th row: 비배출시설계

Common Values

Value | Count | Frequency (%)
비배출시설계 | 39 | 95.1%
배출시설계 | 2 | 4.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
비배출시설계 | 39 | 95.1%
배출시설계 | 2 | 4.9%

업체명 (business name)
Text

UNIQUE 

Distinct: 41
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B

Length

Max length: 17
Median length: 13
Mean length: 8
Min length: 4

Characters and Unicode

Total characters: 328
Distinct characters: 148
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
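
As a rough illustration of how such character properties can be read off, the sketch below tallies Unicode general categories for the five sample 업체명 values listed in this report, using Python's standard `unicodedata` module. It covers general categories only (the standard library does not expose script or block), and the helper name `category_counts` is illustrative, not part of the profiling tool.

```python
import unicodedata
from collections import Counter

# The five sample values shown for 업체명 in this report.
names = [
    "(주)메가마트 동래점",
    "대동병원",
    "(주)호텔농심",
    "우리들병원",
    "에스케이허브스카이주식회사",
]


def category_counts(strings):
    """Count Unicode general categories (Lo, Ps, Pe, Zs, ...) over all characters."""
    return Counter(unicodedata.category(ch) for s in strings for ch in s)


counts = category_counts(names)
print(counts)

# U+321C is the single-character abbreviation for "(주)"; it is classed as
# "So" (Other Symbol), not as a letter.
print(unicodedata.category("\u321c"))
```

Hangul syllables fall under `Lo` (Other Letter), parentheses under `Ps` and `Pe`, and the space under `Zs`, which lines up with the category names used in the tables that follow.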

Unique

Unique: 41
Unique (%): 100.0%

Sample

1st row: (주)메가마트 동래점
2nd row: 대동병원
3rd row: (주)호텔농심
4th row: 우리들병원
5th row: 에스케이허브스카이주식회사
Value | Count | Frequency (%)
주)메가마트 | 1 | 2.2%
롯데자이언트 | 1 | 2.2%
주)비케이더블유메디컬센터 | 1 | 2.2%
준요양병원 | 1 | 2.2%
성산현대요양병원 | 1 | 2.2%
새힘병원 | 1 | 2.2%
대한불교조계종선암사 | 1 | 2.2%
의)은성의료재단좋은애인병원 | 1 | 2.2%
한국전력동래 | 1 | 2.2%
숨은골목번영회 | 1 | 2.2%
Other values (35) | 35 | 77.8%

Most occurring characters

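Value | Count | Frequency (%)
[not captured] | 18 | 5.5%
[not captured] | 15 | 4.6%
[not captured] | 10 | 3.0%
[not captured] | 9 | 2.7%
[not captured] | 8 | 2.4%
[not captured] | 8 | 2.4%
[not captured] | 7 | 2.1%
( | 7 | 2.1%
) | 7 | 2.1%
[not captured] | 6 | 1.8%
Other values (138) | 233 | 71.0%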

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 306 | 93.3%
Open Punctuation | 7 | 2.1%
Close Punctuation | 7 | 2.1%
Other Symbol | 4 | 1.2%
Space Separator | 4 | 1.2%

Most frequent character per category

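Other Letter
Value | Count | Frequency (%)
[not captured] | 18 | 5.9%
[not captured] | 15 | 4.9%
[not captured] | 10 | 3.3%
[not captured] | 9 | 2.9%
[not captured] | 8 | 2.6%
[not captured] | 8 | 2.6%
[not captured] | 7 | 2.3%
[not captured] | 6 | 2.0%
[not captured] | 5 | 1.6%
[not captured] | 5 | 1.6%
Other values (134) | 215 | 70.3%
Open Punctuation
Value | Count | Frequency (%)
( | 7 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 7 | 100.0%
Other Symbol
Value | Count | Frequency (%)
㈜ | 4 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 4 | 100.0%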

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 310 | 94.5%
Common | 18 | 5.5%

Most frequent character per script

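Hangul
Value | Count | Frequency (%)
[not captured] | 18 | 5.8%
[not captured] | 15 | 4.8%
[not captured] | 10 | 3.2%
[not captured] | 9 | 2.9%
[not captured] | 8 | 2.6%
[not captured] | 8 | 2.6%
[not captured] | 7 | 2.3%
[not captured] | 6 | 1.9%
[not captured] | 5 | 1.6%
[not captured] | 5 | 1.6%
Other values (135) | 219 | 70.6%
Common
Value | Count | Frequency (%)
( | 7 | 38.9%
) | 7 | 38.9%
(space) | 4 | 22.2%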

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 306 | 93.3%
ASCII | 18 | 5.5%
None | 4 | 1.2%

Most frequent character per block

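Hangul
Value | Count | Frequency (%)
[not captured] | 18 | 5.9%
[not captured] | 15 | 4.9%
[not captured] | 10 | 3.3%
[not captured] | 9 | 2.9%
[not captured] | 8 | 2.6%
[not captured] | 8 | 2.6%
[not captured] | 7 | 2.3%
[not captured] | 6 | 2.0%
[not captured] | 5 | 1.6%
[not captured] | 5 | 1.6%
Other values (134) | 215 | 70.3%
ASCII
Value | Count | Frequency (%)
( | 7 | 38.9%
) | 7 | 38.9%
(space) | 4 | 22.2%
None
Value | Count | Frequency (%)
㈜ | 4 | 100.0%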

도로명주소 (street address)
Text

Distinct: 39
Distinct (%): 95.1%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B

Length

Max length: 34
Median length: 31
Mean length: 23.926829
Min length: 20

Characters and Unicode

Total characters: 981
Distinct characters: 73
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 37
Unique (%): 90.2%
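
The two counts measure different things: Distinct is the number of different values, while Unique is the number of values that occur exactly once. With 41 rows, 39 distinct values and 37 unique values, two addresses must each appear twice (37 + 2 × 2 = 41). The sketch below shows the distinction on synthetic placeholder strings rather than the real addresses, and `distinct_and_unique` is an illustrative helper name.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    freq = Counter(values)
    distinct = len(freq)
    unique = sum(1 for count in freq.values() if count == 1)
    return distinct, unique


# Synthetic stand-in for the column: 37 one-off values plus 2 values used twice.
values = [f"addr-{i}" for i in range(37)] + ["dup-a"] * 2 + ["dup-b"] * 2

distinct, unique = distinct_and_unique(values)
n = len(values)
print(distinct, f"{distinct / n:.1%}")  # 39 95.1%
print(unique, f"{unique / n:.1%}")      # 37 90.2%
```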

Sample

1st row: 부산광역시 동래구 충렬대로 197(명륜동)
2nd row: 부산광역시 동래구 충렬대로 187(명륜동)
3rd row: 부산광역시 동래구 금강공원로 20번길23(온천동)
4th row: 부산광역시 동래구 중앙대로 1523(온천동)
5th row: 부산광역시 동래구 중앙대로 1523, 2층 관리사무소(온천동)
Value | Count | Frequency (%)
부산광역시 | 41 | 25.3%
동래구 | 41 | 25.3%
충렬대로 | 9 | 5.6%
중앙대로 | 3 | 1.9%
아시아드대로 | 3 | 1.9%
여고로 | 2 | 1.2%
1523(온천동 | 2 | 1.2%
중앙대로1393(온천동 | 2 | 1.2%
반송로 | 2 | 1.2%
금강공원로 | 2 | 1.2%
Other values (55) | 55 | 34.0%

Most occurring characters

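Value | Count | Frequency (%)
(space) | 121 | 12.3%
[not captured] | 84 | 8.6%
[not captured] | 47 | 4.8%
[not captured] | 42 | 4.3%
[not captured] | 41 | 4.2%
[not captured] | 41 | 4.2%
) | 41 | 4.2%
[not captured] | 41 | 4.2%
( | 41 | 4.2%
[not captured] | 41 | 4.2%
Other values (63) | 441 | 45.0%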

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 647 | 66.0%
Decimal Number | 128 | 13.0%
Space Separator | 121 | 12.3%
Close Punctuation | 41 | 4.2%
Open Punctuation | 41 | 4.2%
Other Punctuation | 3 | 0.3%

Most frequent character per category

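Other Letter
Value | Count | Frequency (%)
[not captured] | 84 | 13.0%
[not captured] | 47 | 7.3%
[not captured] | 42 | 6.5%
[not captured] | 41 | 6.3%
[not captured] | 41 | 6.3%
[not captured] | 41 | 6.3%
[not captured] | 41 | 6.3%
[not captured] | 41 | 6.3%
[not captured] | 40 | 6.2%
[not captured] | 20 | 3.1%
Other values (49) | 209 | 32.3%
Decimal Number
Value | Count | Frequency (%)
1 | 28 | 21.9%
2 | 21 | 16.4%
3 | 17 | 13.3%
4 | 14 | 10.9%
5 | 12 | 9.4%
8 | 11 | 8.6%
9 | 8 | 6.2%
7 | 7 | 5.5%
0 | 6 | 4.7%
6 | 4 | 3.1%
Space Separator
Value | Count | Frequency (%)
(space) | 121 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 41 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 41 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
, | 3 | 100.0%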

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 647 | 66.0%
Common | 334 | 34.0%

Most frequent character per script

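Hangul
Value | Count | Frequency (%)
[not captured] | 84 | 13.0%
[not captured] | 47 | 7.3%
[not captured] | 42 | 6.5%
[not captured] | 41 | 6.3%
[not captured] | 41 | 6.3%
[not captured] | 41 | 6.3%
[not captured] | 41 | 6.3%
[not captured] | 41 | 6.3%
[not captured] | 40 | 6.2%
[not captured] | 20 | 3.1%
Other values (49) | 209 | 32.3%
Common
Value | Count | Frequency (%)
(space) | 121 | 36.2%
) | 41 | 12.3%
( | 41 | 12.3%
1 | 28 | 8.4%
2 | 21 | 6.3%
3 | 17 | 5.1%
4 | 14 | 4.2%
5 | 12 | 3.6%
8 | 11 | 3.3%
9 | 8 | 2.4%
Other values (4) | 20 | 6.0%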

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 647 | 66.0%
ASCII | 334 | 34.0%

Most frequent character per block

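ASCII
Value | Count | Frequency (%)
(space) | 121 | 36.2%
) | 41 | 12.3%
( | 41 | 12.3%
1 | 28 | 8.4%
2 | 21 | 6.3%
3 | 17 | 5.1%
4 | 14 | 4.2%
5 | 12 | 3.6%
8 | 11 | 3.3%
9 | 8 | 2.4%
Other values (4) | 20 | 6.0%
Hangul
Value | Count | Frequency (%)
[not captured] | 84 | 13.0%
[not captured] | 47 | 7.3%
[not captured] | 42 | 6.5%
[not captured] | 41 | 6.3%
[not captured] | 41 | 6.3%
[not captured] | 41 | 6.3%
[not captured] | 41 | 6.3%
[not captured] | 41 | 6.3%
[not captured] | 40 | 6.2%
[not captured] | 20 | 3.1%
Other values (49) | 209 | 32.3%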

폐기물종류 (waste type)
Categorical

HIGH CORRELATION 

Distinct: 18
Distinct (%): 43.9%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B
Category counts:
음식물류,그밖의폐기물(일반쓰레기): 15
음식물류, 그밖의폐기물(일반쓰레기): 7
그밖의폐기물(일반쓰레기): 4
음식물류,그밖의폐기물(일반쓰레기: 1
음식물류, 폐합성수지, 그밖의폐기물(일반쓰레기): 1
Other values (13): 13

Length

Max length: 43
Median length: 32
Mean length: 18.878049
Min length: 3

Unique

Unique: 15
Unique (%): 36.6%

Sample

1st row: 음식물류,폐합성수지,그밖의폐기물(일반쓰레기)
2nd row: 음식물류,그밖의폐기물
3rd row: 음식물류, 폐합성수지, 그밖의폐기물(일반쓰레기)
4th row: 그밖의 폐섬유, 음식물폐기물,그밖의폐기물
5th row: 음식물류,그밖의폐기물(일반쓰레기)

Common Values

Value | Count | Frequency (%)
음식물류,그밖의폐기물(일반쓰레기) | 15 | 36.6%
음식물류, 그밖의폐기물(일반쓰레기) | 7 | 17.1%
그밖의폐기물(일반쓰레기) | 4 | 9.8%
음식물류,그밖의폐기물(일반쓰레기 | 1 | 2.4%
음식물류, 폐합성수지, 그밖의폐기물(일반쓰레기) | 1 | 2.4%
그밖의 폐섬유, 음식물폐기물,그밖의폐기물 | 1 | 2.4%
폐합성수지류,음식물류,그밖의폐기물 | 1 | 2.4%
폐발포합성,폐식용유,축산물가공잔재물,동물성유지,음식물류,그밖의폐기물,폐합성수지 | 1 | 2.4%
폐발포합성수지,음식물류,그밖의폐기물,동물성잔재물 | 1 | 2.4%
음식물류,그밖의폐기물 | 1 | 2.4%
Other values (8) | 8 | 19.5%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
음식물류,그밖의폐기물(일반쓰레기 | 17 | 31.5%
그밖의폐기물(일반쓰레기 | 13 | 24.1%
음식물류 | 8 | 14.8%
그밖의폐섬유 | 2 | 3.7%
음식물류,그밖의폐기물 | 1 | 1.9%
음식물류,폐합성수지,그밖의폐기물(일반쓰레기 | 1 | 1.9%
폐발포합성,축산물가공잔재물,폐식용유,음식물류,그밖의폐기물 | 1 | 1.9%
정수처리오니 | 1 | 1.9%
폐발포합성수지,플라스틱폐포장제,폐유리병류,음식물,일반쓰레기 | 1 | 1.9%
폐전주 | 1 | 1.9%
Other values (8) | 8 | 14.8%

Interactions


Correlations

 | 연번 | 구분 | 업체명 | 도로명주소 | 폐기물종류
연번 | 1.000 | 0.729 | 1.000 | 0.967 | 0.674
구분 | 0.729 | 1.000 | 1.000 | 1.000 | 1.000
업체명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
도로명주소 | 0.967 | 1.000 | 1.000 | 1.000 | 0.000
폐기물종류 | 0.674 | 1.000 | 1.000 | 0.000 | 1.000

 | 구분 | 폐기물종류
구분 | 1.000 | 0.768
폐기물종류 | 0.768 | 1.000
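
The report does not say which association measure produced the 0.768 between 구분 and 폐기물종류. One plausible candidate is Cramér's V with the Bergsma bias correction, and the sketch below checks that guess. It rebuilds the contingency table from counts given elsewhere in this report (39 and 2 rows for 구분; value frequencies of 15, 7 and 4 plus fifteen single-occurrence values for 폐기물종류) under the assumption, suggested by the last two sample rows, that both 배출시설계 records carry waste types that no other record shares. The table layout and the function name are illustrative.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V (Bergsma correction) for a contingency table."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


# Assumed reconstruction: 18 waste-type columns, 2 구분 rows.
# Row 0 (비배출시설계, 39 records): three frequent values and 13 singletons.
# Row 1 (배출시설계, 2 records): two singletons of its own.
table = [
    [15, 7, 4] + [1] * 13 + [0, 0],
    [0] * 16 + [1, 1],
]

v = cramers_v_corrected(table)
print(round(v, 3))  # 0.768
```

Under these assumptions the two variables are perfectly associated in the raw table, and it is the bias correction for 18 sparse categories over only 41 rows that pulls the value down from 1 to 0.768.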

 | 연번 | 구분 | 폐기물종류
연번 | 1.000 | 0.506 | 0.225
구분 | 0.506 | 1.000 | 0.768
폐기물종류 | 0.225 | 0.768 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
(index) | 연번 | 구분 | 업체명 | 도로명주소 | 폐기물종류
0 | 1 | 비배출시설계 | (주)메가마트 동래점 | 부산광역시 동래구 충렬대로 197(명륜동) | 음식물류,폐합성수지,그밖의폐기물(일반쓰레기)
1 | 2 | 비배출시설계 | 대동병원 | 부산광역시 동래구 충렬대로 187(명륜동) | 음식물류,그밖의폐기물
2 | 3 | 비배출시설계 | (주)호텔농심 | 부산광역시 동래구 금강공원로 20번길23(온천동) | 음식물류, 폐합성수지, 그밖의폐기물(일반쓰레기)
3 | 4 | 비배출시설계 | 우리들병원 | 부산광역시 동래구 중앙대로 1523(온천동) | 그밖의 폐섬유, 음식물폐기물,그밖의폐기물
4 | 5 | 비배출시설계 | 에스케이허브스카이주식회사 | 부산광역시 동래구 중앙대로 1523, 2층 관리사무소(온천동) | 음식물류,그밖의폐기물(일반쓰레기)
5 | 6 | 비배출시설계 | 금강공원사업소 | 부산광역시 동래구 우장춘로155(온천동) | 그밖의폐기물(일반쓰레기)
6 | 7 | 비배출시설계 | 세계로병원 | 부산광역시 동래구 종합운동장로 42(사직동) | 폐합성수지류,음식물류,그밖의폐기물
7 | 8 | 비배출시설계 | 부산광역시체육시설사업소 | 부산광역시 동래구 사직로45(사직동) | 그밖의폐기물(일반쓰레기)
8 | 9 | 비배출시설계 | 장례식장아시아드 | 부산광역시 동래구 여고로42(사직동) | 음식물류,그밖의폐기물(일반쓰레기)
9 | 10 | 비배출시설계 | 봉생병원 | 부산광역시 동래구 안연로109번길27(안락동) | 음식물류,그밖의폐기물(일반쓰레기)
Last rows
(index) | 연번 | 구분 | 업체명 | 도로명주소 | 폐기물종류
31 | 32 | 비배출시설계 | ㈜프라임장례원 | 부산광역시 동래구 미남로 146번길 11(온천동) | 음식물류, 그밖의폐기물(일반쓰레기)
32 | 33 | 비배출시설계 | ㈜푸드엔 동래지점 | 부산광역시 동래구 온천장로 24(온천동) | 음식물류, 그밖의폐기물(일반쓰레기)
33 | 34 | 비배출시설계 | 세인요양병원 | 부산광역시 동래구 여고로 5(사직동) | 그밖의폐섬유, 그밖의폐기물(일반쓰레기)
34 | 35 | 비배출시설계 | 잼있는부엌 | 부산광역시 동래구 연안로58번길 14(안락동) | 음식물류, 그밖의폐기물(일반쓰레기)
35 | 36 | 비배출시설계 | ㈜착한전문장례식장 | 부산광역시 동래구 반송로 183(낙민동) | 그밖의폐기물(일반쓰레기)
36 | 37 | 비배출시설계 | 보람상조개발㈜부산대동병원장례식장 | 부산광역시 동래구 충렬대로181번길 22(명륜동) | 음식물류, 그밖의폐기물(일반쓰레기)
37 | 38 | 비배출시설계 | 거인병원 | 부산광역시 동래구 여고로 129(사직동) | 음식물류, 그밖의폐기물(일반쓰레기)
38 | 39 | 비배출시설계 | 용인고등학교 | 부산광역시 동래구 시실로107번길 3(명장동) | 음식물류, 그밖의폐기물(일반쓰레기)
39 | 40 | 배출시설계 | 명장정수사업소 | 부산광역시 동래구 반송로 310(명장동) | 정수처리오니
40 | 41 | 배출시설계 | 부산환경공단수영사업소 | 부산광역시 동래구 온천천남로185(안락동) | 사업장폐기물(하수슬러지)