Overview

Dataset statistics

Number of variables: 7
Number of observations: 70
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.0 KiB
Average record size in memory: 58.9 B

Variable types

Numeric: 1
Text: 3
DateTime: 1
Categorical: 2

Dataset

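Description: 부산광역시남구_사업장폐기물배출자_20210617 (Busan Metropolitan City Nam-gu, business waste dischargers, 20210617)
Author: 부산광역시 남구 (Nam-gu, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15060308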

Alerts

연번 is highly overall correlated with 폐기물구분 [High correlation]
폐기물구분 is highly overall correlated with 연번 and 1 other field [High correlation]
폐기물종류 is highly overall correlated with 폐기물구분 [High correlation]
폐기물종류 is highly imbalanced (65.7%) [Imbalance]
연번 has unique values [Unique]
상호명 has unique values [Unique]
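
The 65.7% imbalance figure can be reproduced with a normalized-entropy score, 1 - H / log2(k), applied to the 폐기물종류 counts listed under Common Values. The sketch below is a minimal illustration; the function name `imbalance_score` is hypothetical, and it is an assumption that the profiler uses exactly this definition, although it does match the reported value.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (base 2)
    of the class distribution and k is the number of classes.
    0 means perfectly balanced; values near 1 mean highly imbalanced."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Counts taken from the report's Common Values tables.
waste_type_counts = [59, 4, 2, 1, 1, 1, 1, 1]   # 폐기물종류 (8 classes)
waste_class_counts = [62, 8]                    # 폐기물구분 (2 classes)

score_type = imbalance_score(waste_type_counts)
score_class = imbalance_score(waste_class_counts)

print(f"폐기물종류: {score_type:.1%}")
print(f"폐기물구분: {score_class:.1%}")
```

Under this definition 폐기물구분 scores below one half, which would be consistent with it not appearing in the imbalance alert.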

Reproduction

Analysis started: 2023-12-10 17:14:45.232675
Analysis finished: 2023-12-10 17:14:47.660771
Duration: 2.43 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 70
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.5
Minimum: 1
Maximum: 70
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 762.0 B

Quantile statistics

Minimum: 1
5-th percentile: 4.45
Q1: 18.25
Median: 35.5
Q3: 52.75
95-th percentile: 66.55
Maximum: 70
Range: 69
Interquartile range (IQR): 34.5

Descriptive statistics

Standard deviation: 20.351085
Coefficient of variation (CV): 0.57327
Kurtosis: -1.2
Mean: 35.5
Median Absolute Deviation (MAD): 17.5
Skewness: 0
Sum: 2485
Variance: 414.16667
Monotonicity: Strictly increasing
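
Because 연번 is simply the sequence 1 to 70, most of the quantile and descriptive statistics above can be checked by hand. The following is a minimal sketch using only the Python standard library. It assumes the report uses the sample variance (n - 1 denominator) and linearly interpolated quantiles, which is what the reported figures are consistent with.

```python
import statistics

values = list(range(1, 71))  # 연번 runs from 1 to 70 with no gaps

mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)   # sample variance, n - 1 denominator
std = statistics.stdev(values)
cv = std / mean                          # coefficient of variation
mad = statistics.median(abs(v - median) for v in values)

# "inclusive" interpolates linearly between the observed min and max.
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
twentieths = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(mean, median, round(variance, 5), round(std, 6), round(cv, 5))
print(p5, q1, q3, p95, q3 - q1, mad, sum(values), strictly_increasing)
```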
Histogram with fixed size bins (bins=50)
Common Values
Value | Count | Frequency (%)
1 | 1 | 1.4%
46 | 1 | 1.4%
52 | 1 | 1.4%
51 | 1 | 1.4%
50 | 1 | 1.4%
49 | 1 | 1.4%
48 | 1 | 1.4%
47 | 1 | 1.4%
45 | 1 | 1.4%
54 | 1 | 1.4%
Other values (60) | 60 | 85.7%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 1.4%
2 | 1 | 1.4%
3 | 1 | 1.4%
4 | 1 | 1.4%
5 | 1 | 1.4%
6 | 1 | 1.4%
7 | 1 | 1.4%
8 | 1 | 1.4%
9 | 1 | 1.4%
10 | 1 | 1.4%

Maximum 10 values
Value | Count | Frequency (%)
70 | 1 | 1.4%
69 | 1 | 1.4%
68 | 1 | 1.4%
67 | 1 | 1.4%
66 | 1 | 1.4%
65 | 1 | 1.4%
64 | 1 | 1.4%
63 | 1 | 1.4%
62 | 1 | 1.4%
61 | 1 | 1.4%

상호명 (business name)
Text

UNIQUE 

Distinct: 70
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 692.0 B

Length

Max length: 15
Median length: 13
Mean length: 7.0857143
Min length: 3

Characters and Unicode

Total characters: 496
Distinct characters: 156
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
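
As an illustration of how such character properties can be tallied, the sketch below counts Unicode general categories for the five sample 상호명 values shown in this report, using Python's standard `unicodedata` module. The helper `category_counts` and the code-to-name mapping are illustrative assumptions, not the profiler's own implementation. Scripts and blocks are not covered because the standard library does not expose them.

```python
import unicodedata
from collections import Counter

# Two-letter general category codes mapped to the names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "So": "Other Symbol",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
}


def category_counts(strings):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in strings:
        for ch in text:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# The five sample rows of 상호명 listed in the report.
sample_names = ["대연중학교", "동명대학교", "해군작전사령부", "부경대학교(대연)", "이마트 문현점"]
counts = category_counts(sample_names)

for name, count in counts.most_common():
    print(name, count)
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables that follow.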

Unique

Unique: 70
Unique (%): 100.0%

Sample

1st row: 대연중학교
2nd row: 동명대학교
3rd row: 해군작전사령부
4th row: 부경대학교(대연)
5th row: 이마트 문현점

Value | Count | Frequency (%)
감만점 | 2 | 2.4%
대연중학교 | 1 | 1.2%
석포여자중학교 | 1 | 1.2%
동국제강㈜ | 1 | 1.2%
동명대학교 | 1 | 1.2%
그랜드자연요양병원 | 1 | 1.2%
한국은행부산본부 | 1 | 1.2%
새라새요양병원 | 1 | 1.2%
인창대연요양병원 | 1 | 1.2%
대남초등학교 | 1 | 1.2%
Other values (72) | 72 | 86.7%

Most occurring characters

ValueCountFrequency (%)
29
 
5.8%
29
 
5.8%
18
 
3.6%
16
 
3.2%
13
 
2.6%
13
 
2.6%
13
 
2.6%
13
 
2.6%
12
 
2.4%
11
 
2.2%
Other values (146) 329
66.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 459
92.5%
Space Separator 13
 
2.6%
Other Symbol 10
 
2.0%
Decimal Number 4
 
0.8%
Uppercase Letter 4
 
0.8%
Open Punctuation 3
 
0.6%
Close Punctuation 3
 
0.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
29
 
6.3%
29
 
6.3%
18
 
3.9%
16
 
3.5%
13
 
2.8%
13
 
2.8%
13
 
2.8%
12
 
2.6%
11
 
2.4%
10
 
2.2%
Other values (135) 295
64.3%
Uppercase Letter
ValueCountFrequency (%)
S 1
25.0%
K 1
25.0%
L 1
25.0%
G 1
25.0%
Decimal Number
ValueCountFrequency (%)
2 2
50.0%
3 1
25.0%
1 1
25.0%
Space Separator
ValueCountFrequency (%)
13
100.0%
Other Symbol
ValueCountFrequency (%)
10
100.0%
Open Punctuation
ValueCountFrequency (%)
( 3
100.0%
Close Punctuation
ValueCountFrequency (%)
) 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 469
94.6%
Common 23
 
4.6%
Latin 4
 
0.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
29
 
6.2%
29
 
6.2%
18
 
3.8%
16
 
3.4%
13
 
2.8%
13
 
2.8%
13
 
2.8%
12
 
2.6%
11
 
2.3%
10
 
2.1%
Other values (136) 305
65.0%
Common
ValueCountFrequency (%)
13
56.5%
( 3
 
13.0%
) 3
 
13.0%
2 2
 
8.7%
3 1
 
4.3%
1 1
 
4.3%
Latin
ValueCountFrequency (%)
S 1
25.0%
K 1
25.0%
L 1
25.0%
G 1
25.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 459
92.5%
ASCII 27
 
5.4%
None 10
 
2.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
29
 
6.3%
29
 
6.3%
18
 
3.9%
16
 
3.5%
13
 
2.8%
13
 
2.8%
13
 
2.8%
12
 
2.6%
11
 
2.4%
10
 
2.2%
Other values (135) 295
64.3%
ASCII
ValueCountFrequency (%)
13
48.1%
( 3
 
11.1%
) 3
 
11.1%
2 2
 
7.4%
S 1
 
3.7%
3 1
 
3.7%
K 1
 
3.7%
1 1
 
3.7%
L 1
 
3.7%
G 1
 
3.7%
None
ValueCountFrequency (%)
10
100.0%

주 소 (address)
Text

Distinct: 69
Distinct (%): 98.6%
Missing: 0
Missing (%): 0.0%
Memory size: 692.0 B

Length

Max length: 20
Median length: 19
Mean length: 14.271429
Min length: 9

Characters and Unicode

Total characters: 999
Distinct characters: 77
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 68
Unique (%): 97.1%

Sample

1st row: 천제등로16번길 81 (대연동)
2nd row: 신선로 428 (용당동)
3rd row: 백운포로 95 (용호동)
4th row: 용소로 45 (대연동)
5th row: 전포대로91번길 47 (문현동)

Value | Count | Frequency (%)
대연동 | 26 | 12.8%
용호동 | 13 | 6.4%
용당동 | 11 | 5.4%
수영로 | 10 | 4.9%
신선로 | 9 | 4.4%
감만동 | 9 | 4.4%
문현동 | 8 | 3.9%
분포로 | 4 | 2.0%
신선로102 | 2 | 1.0%
19 | 2 | 1.0%
Other values (101) | 109 | 53.7%

Most occurring characters

ValueCountFrequency (%)
133
 
13.3%
73
 
7.3%
69
 
6.9%
) 68
 
6.8%
( 68
 
6.8%
1 58
 
5.8%
35
 
3.5%
2 31
 
3.1%
29
 
2.9%
5 27
 
2.7%
Other values (67) 408
40.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 478
47.8%
Decimal Number 238
23.8%
Space Separator 133
 
13.3%
Close Punctuation 68
 
6.8%
Open Punctuation 68
 
6.8%
Dash Punctuation 14
 
1.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
73
15.3%
69
14.4%
35
 
7.3%
29
 
6.1%
27
 
5.6%
17
 
3.6%
16
 
3.3%
15
 
3.1%
13
 
2.7%
13
 
2.7%
Other values (53) 171
35.8%
Decimal Number
ValueCountFrequency (%)
1 58
24.4%
2 31
13.0%
5 27
11.3%
3 27
11.3%
4 22
 
9.2%
9 19
 
8.0%
6 15
 
6.3%
8 14
 
5.9%
0 14
 
5.9%
7 11
 
4.6%
Space Separator
ValueCountFrequency (%)
133
100.0%
Close Punctuation
ValueCountFrequency (%)
) 68
100.0%
Open Punctuation
ValueCountFrequency (%)
( 68
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 14
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 521
52.2%
Hangul 478
47.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
73
15.3%
69
14.4%
35
 
7.3%
29
 
6.1%
27
 
5.6%
17
 
3.6%
16
 
3.3%
15
 
3.1%
13
 
2.7%
13
 
2.7%
Other values (53) 171
35.8%
Common
ValueCountFrequency (%)
133
25.5%
) 68
13.1%
( 68
13.1%
1 58
11.1%
2 31
 
6.0%
5 27
 
5.2%
3 27
 
5.2%
4 22
 
4.2%
9 19
 
3.6%
6 15
 
2.9%
Other values (4) 53
 
10.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 521
52.2%
Hangul 478
47.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
133
25.5%
) 68
13.1%
( 68
13.1%
1 58
11.1%
2 31
 
6.0%
5 27
 
5.2%
3 27
 
5.2%
4 22
 
4.2%
9 19
 
3.6%
6 15
 
2.9%
Other values (4) 53
 
10.2%
Hangul
ValueCountFrequency (%)
73
15.3%
69
14.4%
35
 
7.3%
29
 
6.1%
27
 
5.6%
17
 
3.6%
16
 
3.3%
15
 
3.1%
13
 
2.7%
13
 
2.7%
Other values (53) 171
35.8%

연락처 (contact number)
Text

Distinct: 69
Distinct (%): 98.6%
Missing: 0
Missing (%): 0.0%
Memory size: 692.0 B

Length

Max length: 13
Median length: 8
Mean length: 8.2142857
Min length: 8

Characters and Unicode

Total characters: 575
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 68
Unique (%): 97.1%

Sample

1st row: 606-7705
2nd row: 610-8195
3rd row: 679-2032
4th row: 629-5103
5th row: 609-1052

Value | Count | Frequency (%)
629-5103 | 2 | 2.9%
626-2841 | 1 | 1.4%
622-7180 | 1 | 1.4%
240-3714 | 1 | 1.4%
628-6005 | 1 | 1.4%
774-1004 | 1 | 1.4%
622-9435 | 1 | 1.4%
610-3303 | 1 | 1.4%
629-8912 | 1 | 1.4%
606-7705 | 1 | 1.4%
Other values (59) | 59 | 84.3%

Most occurring characters

ValueCountFrequency (%)
6 87
15.1%
0 77
13.4%
- 73
12.7%
2 65
11.3%
1 51
8.9%
3 51
8.9%
5 41
7.1%
7 41
7.1%
8 31
 
5.4%
4 31
 
5.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 502
87.3%
Dash Punctuation 73
 
12.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 87
17.3%
0 77
15.3%
2 65
12.9%
1 51
10.2%
3 51
10.2%
5 41
8.2%
7 41
8.2%
8 31
 
6.2%
4 31
 
6.2%
9 27
 
5.4%
Dash Punctuation
ValueCountFrequency (%)
- 73
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 575
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
6 87
15.1%
0 77
13.4%
- 73
12.7%
2 65
11.3%
1 51
8.9%
3 51
8.9%
5 41
7.1%
7 41
7.1%
8 31
 
5.4%
4 31
 
5.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 575
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 87
15.1%
0 77
13.4%
- 73
12.7%
2 65
11.3%
1 51
8.9%
3 51
8.9%
5 41
7.1%
7 41
7.1%
8 31
 
5.4%
4 31
 
5.4%

신고일자 (report date)
DateTime

Distinct: 47
Distinct (%): 67.1%
Missing: 0
Missing (%): 0.0%
Memory size: 692.0 B
Minimum: 1999-02-19 00:00:00
Maximum: 2020-10-22 00:00:00
Histogram with fixed size bins (bins=47)

폐기물구분 (waste classification)
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Memory size: 692.0 B
사업장일반 (general business waste): 62
사업장배출시설 (business emission facility waste): 8

Length

Max length: 7
Median length: 5
Mean length: 5.2285714
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 사업장일반
2nd row: 사업장일반
3rd row: 사업장일반
4th row: 사업장일반
5th row: 사업장일반

Common Values

Value | Count | Frequency (%)
사업장일반 | 62 | 88.6%
사업장배출시설 | 8 | 11.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
사업장일반 | 62 | 88.6%
사업장배출시설 | 8 | 11.4%

폐기물종류 (waste type)
Categorical

HIGH CORRELATION  IMBALANCE 

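Distinct: 8
Distinct (%): 11.4%
Missing: 0
Missing (%): 0.0%
Memory size: 692.0 B
생활폐기물 (household waste): 59
폐합성수지 (waste synthetic resin): 4
폐콘크리트 (waste concrete): 2
폐석재 (waste stone): 1
폐수처리오니, 공정오니 (wastewater treatment sludge, process sludge): 1
Other values (3): 3

Length

Max length: 12
Median length: 5
Mean length: 5.1285714
Min length: 3

Unique

Unique: 5
Unique (%): 7.1%

Sample

1st row: 생활폐기물
2nd row: 생활폐기물
3rd row: 생활폐기물
4th row: 생활폐기물
5th row: 생활폐기물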

Common Values

Value | Count | Frequency (%)
생활폐기물 | 59 | 84.3%
폐합성수지 | 4 | 5.7%
폐콘크리트 | 2 | 2.9%
폐석재 | 1 | 1.4%
폐수처리오니, 공정오니 | 1 | 1.4%
폐활성탄 | 1 | 1.4%
하수처리오니 | 1 | 1.4%
폐합성수지, 분진 | 1 | 1.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
생활폐기물 | 59 | 81.9%
폐합성수지 | 5 | 6.9%
폐콘크리트 | 2 | 2.8%
폐석재 | 1 | 1.4%
폐수처리오니 | 1 | 1.4%
공정오니 | 1 | 1.4%
폐활성탄 | 1 | 1.4%
하수처리오니 | 1 | 1.4%
분진 | 1 | 1.4%
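
The percentages in this last table differ from those under Common Values because the table appears to count individual words rather than whole cell values: "폐수처리오니, 공정오니" and "폐합성수지, 분진" each contribute two words, giving 72 words across 70 rows. The sketch below reproduces the figures under that reading. Splitting on commas and whitespace is an assumption about the tokenization, not a documented rule.

```python
import re
from collections import Counter

# Whole-value counts from the Common Values table for 폐기물종류.
value_counts = {
    "생활폐기물": 59,
    "폐합성수지": 4,
    "폐콘크리트": 2,
    "폐석재": 1,
    "폐수처리오니, 공정오니": 1,
    "폐활성탄": 1,
    "하수처리오니": 1,
    "폐합성수지, 분진": 1,
}

word_counts = Counter()
for value, count in value_counts.items():
    # Assumed tokenization: split on commas and whitespace.
    for word in re.split(r"[,\s]+", value.strip()):
        if word:
            word_counts[word] += count

total_words = sum(word_counts.values())
shares = {w: round(100 * c / total_words, 1) for w, c in word_counts.items()}

for word, count in word_counts.most_common():
    print(word, count, f"{shares[word]}%")
```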

Interactions


Correlations

Variable | 연번 | 상호명 | 주 소 | 연락처 | 신고일자 | 폐기물구분 | 폐기물종류
연번 | 1.000 | 1.000 | 0.938 | 0.938 | 0.962 | 0.993 | 0.406
상호명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
주 소 | 0.938 | 1.000 | 1.000 | 0.998 | 0.967 | 1.000 | 0.000
연락처 | 0.938 | 1.000 | 0.998 | 1.000 | 0.967 | 1.000 | 1.000
신고일자 | 0.962 | 1.000 | 0.967 | 0.967 | 1.000 | 1.000 | 1.000
폐기물구분 | 0.993 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.992
폐기물종류 | 0.406 | 1.000 | 0.000 | 1.000 | 1.000 | 0.992 | 1.000

Variable | 폐기물종류 | 폐기물구분
폐기물종류 | 1.000 | 0.877
폐기물구분 | 0.877 | 1.000
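
The 0.877 between 폐기물종류 and 폐기물구분 is consistent with a bias-corrected Cramér's V. The sketch below is a plain-Python illustration, not the profiler's code. The contingency table is an inference: the last rows of the Sample section list all eight 사업장배출시설 records, and the 사업장일반 column is obtained by subtracting those from the Common Values counts. The function name `cramers_v_corrected` is hypothetical.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic (no continuity correction).
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Bias correction of phi^2 and of the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Rows: 사업장일반, 사업장배출시설
# Columns: 생활폐기물, 폐합성수지, 폐콘크리트, 폐석재,
#          "폐수처리오니, 공정오니", 폐활성탄, 하수처리오니, "폐합성수지, 분진"
table = [
    [59, 2, 0, 1, 0, 0, 0, 0],
    [0, 2, 2, 0, 1, 1, 1, 1],
]

v = cramers_v_corrected(table)
print(round(v, 3))
```

Without the bias correction the same table gives a noticeably higher value (about 0.93), so the correction matters at this sample size.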

Variable | 연번 | 폐기물구분 | 폐기물종류
연번 | 1.000 | 0.872 | 0.199
폐기물구분 | 0.872 | 1.000 | 0.877
폐기물종류 | 0.199 | 0.877 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Index | 연번 | 상호명 | 주 소 | 연락처 | 신고일자 | 폐기물구분 | 폐기물종류
0 | 1 | 대연중학교 | 천제등로16번길 81 (대연동) | 606-7705 | 1999-02-19 | 사업장일반 | 생활폐기물
1 | 2 | 동명대학교 | 신선로 428 (용당동) | 610-8195 | 2001-03-30 | 사업장일반 | 생활폐기물
2 | 3 | 해군작전사령부 | 백운포로 95 (용호동) | 679-2032 | 2001-09-20 | 사업장일반 | 생활폐기물
3 | 4 | 부경대학교(대연) | 용소로 45 (대연동) | 629-5103 | 2002-12-11 | 사업장일반 | 생활폐기물
4 | 5 | 이마트 문현점 | 전포대로91번길 47 (문현동) | 609-1052 | 2003-09-06 | 사업장일반 | 생활폐기물
5 | 6 | 홈플러스 감만점 | 우암로 124 (감만동) | 609-8124 | 2006-08-07 | 사업장일반 | 생활폐기물
6 | 7 | 21센츄리시티 | 수영로 312 (대연동) | 610-0023 | 2007-01-15 | 사업장일반 | 생활폐기물
7 | 8 | 항만운영단 | 8부두로 41 (감만동) | 730-5047 | 2007-06-22 | 사업장일반 | 생활폐기물
8 | 9 | 경성대학교 | 수영로 309 (대연동) | 663-4091 | 2008-02-29 | 사업장일반 | 생활폐기물
9 | 10 | 부산항터미널㈜ | 신선로 294 (용당동) | 620-0252 | 2008-03-19 | 사업장일반 | 폐합성수지

Last rows
Index | 연번 | 상호명 | 주 소 | 연락처 | 신고일자 | 폐기물구분 | 폐기물종류
60 | 61 | 대연 SK뷰힐스상가관리위원회 | 수영로 261 (대연동) | 611-6863 | 2020-08-03 | 사업장일반 | 생활폐기물
61 | 62 | (주)푸드엔 대연지점 | 석포로 112 (대연동) | 923-5599 | 2020-10-22 | 사업장일반 | 생활폐기물
62 | 63 | 동국제강㈜ | 신선로102 (감만동) | 640-5237 | 1999-11-25 | 사업장배출시설 | 폐수처리오니, 공정오니
63 | 64 | 피피지코리아 | 신선로356번길 21 (용당동) | 620-8213 | 2000-04-08 | 사업장배출시설 | 폐활성탄
64 | 65 | ㈜농협사료 부산바이오 | 우암로337 (문현동) | 606-1923 | 2000-05-10 | 사업장배출시설 | 폐합성수지
65 | 66 | 부산환경공단 남부사업소 | 이기대공원로 11 | 713-0135 | 2001-03-29 | 사업장배출시설 | 하수처리오니
66 | 67 | 그린웍스㈜ | 신선로102 (감만동) | 637-9307 | 2017-09-22 | 사업장배출시설 | 폐합성수지, 분진
67 | 68 | ㈜미륭레미콘 | 신선대산복로 30 (용당동) | 621-5551 | 2008-08-18 | 사업장배출시설 | 폐콘크리트
68 | 69 | ㈜한탑 | 용소로 101 (대연동) | 626-2841 | 2015-07-13 | 사업장배출시설 | 폐합성수지
69 | 70 | 한통레미콘 주식회사 | 신선로 229 (용당동) | 627-3300 | 2020-02-28 | 사업장배출시설 | 폐콘크리트