Overview

Dataset statistics

Number of variables: 8
Number of observations: 317
Missing cells: 6
Missing cells (%): 0.2%
Duplicate rows: 1
Duplicate rows (%): 0.3%
Total size in memory: 20.3 KiB
Average record size in memory: 65.4 B
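
The two percentages above follow directly from the counts. A minimal sketch of the arithmetic, assuming the missing-cell share is taken over all cells (variables times observations) and the duplicate share over all observations; the helper name `percent` is illustrative only:

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 8
n_observations = 317
missing_cells = 6
duplicate_rows = 1


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_cells = n_variables * n_observations            # 8 * 317 = 2536 cells
missing_pct = percent(missing_cells, total_cells)     # share of empty cells
duplicate_pct = percent(duplicate_rows, n_observations)

print(total_cells, missing_pct, duplicate_pct)
```

Both results round to the values shown in the report (0.2% and 0.3%).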

Variable types

Numeric: 1
Categorical: 5
Text: 2

Dataset

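Description: Installation information for street litter bins in Gangseo-gu, Seoul. It covers the administrative dong of each installation site, the installation location, detailed location information, the bin type, the bin form, and the contact number of the person in charge.
Author: Gangseo-gu, Seoul (서울특별시 강서구)
URL: https://www.data.go.kr/data/15086961/fileData.do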

Alerts

Dataset has 1 (0.3%) duplicate rows [Duplicates]
자치구명 is highly overall correlated with 연번 and 4 other fields [High correlation]
설치장소유형 is highly overall correlated with 자치구명 and 1 other fields [High correlation]
쓰레기통 형태 is highly overall correlated with 연번 and 2 other fields [High correlation]
전화번호 is highly overall correlated with 연번 and 4 other fields [High correlation]
수거쓰레기 종류 is highly overall correlated with 자치구명 and 1 other fields [High correlation]
연번 is highly overall correlated with 자치구명 and 2 other fields [High correlation]
자치구명 is highly imbalanced (94.5%) [Imbalance]
쓰레기통 형태 is highly imbalanced (83.8%) [Imbalance]
수거쓰레기 종류 is highly imbalanced (55.1%) [Imbalance]
전화번호 is highly imbalanced (94.5%) [Imbalance]
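
The imbalance percentages can be reproduced from the category counts listed for each variable further down. The sketch below assumes the score is one minus the Shannon entropy of the class distribution, normalized by the maximum entropy for that number of classes. The report does not state the formula, but this definition matches all four reported figures. The function name `imbalance_score` and the zero returned for a single class are choices of this sketch.

```python
import math


def imbalance_score(counts):
    """1 - normalized Shannon entropy: 0 means balanced, values near 1 mean one class dominates."""
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # sketch-level guard against dividing by log2(1) = 0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


# Category counts as listed in the per-variable sections (<NA> counted as a class).
class_counts = {
    "자치구명": [315, 2],
    "쓰레기통 형태": [305, 10, 2],
    "수거쓰레기 종류": [261, 54, 2],
    "전화번호": [315, 2],
}
scores = {name: round(100 * imbalance_score(c), 1) for name, c in class_counts.items()}
print(scores)
```

Under this definition a variable with a single dominant value and two stray `<NA>` entries scores close to 100%, which is why 자치구명 and 전화번호 are flagged.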

Reproduction

Analysis started: 2024-04-06 08:33:04.754328
Analysis finished: 2024-04-06 08:33:06.680679
Duration: 1.93 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
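
As a quick consistency check, the duration can be recomputed from the two timestamps. This is only a sketch using the standard library; the format string is an assumption about how the timestamps are laid out.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed layout of the timestamps above

started = datetime.strptime("2024-04-06 08:33:04.754328", FMT)
finished = datetime.strptime("2024-04-06 08:33:06.680679", FMT)

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```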

Variables

연번
Real number (ℝ)

HIGH CORRELATION 

Distinct: 315
Distinct (%): 100.0%
Missing: 2
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 158
Minimum: 1
Maximum: 315
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 16.7
Q1: 79.5
median: 158
Q3: 236.5
95-th percentile: 299.3
Maximum: 315
Range: 314
Interquartile range (IQR): 157

Descriptive statistics

Standard deviation: 91.076891
Coefficient of variation (CV): 0.57643602
Kurtosis: -1.2
Mean: 158
Median Absolute Deviation (MAD): 79
Skewness: 0
Sum: 49770
Variance: 8295
Monotonicity: Strictly increasing
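
These figures are what one would expect if 연번 is a plain running index. The sketch below assumes the 315 non-missing values are exactly 1, 2, ..., 315 (consistent with the minimum, maximum, distinct count and strict monotonicity reported, but not stated outright) and recomputes the statistics with the standard library. The sample variance and the "inclusive" quantile method are assumptions chosen because they reproduce the reported numbers.

```python
import math
import statistics

# Assumption: the non-missing 연번 values are the integers 1..315.
values = list(range(1, 316))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)       # sample variance (n - 1 denominator)
std_dev = math.sqrt(variance)
cv = std_dev / mean                          # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)  # median absolute deviation
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")

print(total, mean, variance, round(std_dev, 6), round(cv, 8), mad, q1, q3, q3 - q1)
```

For a run of consecutive integers 1..n the sample variance is n(n+1)/12, which gives 315 * 316 / 12 = 8295, the value in the report.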
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
209 | 1 | 0.3%
216 | 1 | 0.3%
215 | 1 | 0.3%
214 | 1 | 0.3%
213 | 1 | 0.3%
212 | 1 | 0.3%
211 | 1 | 0.3%
210 | 1 | 0.3%
208 | 1 | 0.3%
218 | 1 | 0.3%
Other values (305) | 305 | 96.2%
(Missing) | 2 | 0.6%

Value | Count | Frequency (%)
1 | 1 | 0.3%
2 | 1 | 0.3%
3 | 1 | 0.3%
4 | 1 | 0.3%
5 | 1 | 0.3%
6 | 1 | 0.3%
7 | 1 | 0.3%
8 | 1 | 0.3%
9 | 1 | 0.3%
10 | 1 | 0.3%

Value | Count | Frequency (%)
315 | 1 | 0.3%
314 | 1 | 0.3%
313 | 1 | 0.3%
312 | 1 | 0.3%
311 | 1 | 0.3%
310 | 1 | 0.3%
309 | 1 | 0.3%
308 | 1 | 0.3%
307 | 1 | 0.3%
306 | 1 | 0.3%

자치구명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 KiB
서울특별시 강서구: 315
<NA>: 2

Length

Max length: 9
Median length: 9
Mean length: 8.9684543
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시 강서구
2nd row: 서울특별시 강서구
3rd row: 서울특별시 강서구
4th row: 서울특별시 강서구
5th row: 서울특별시 강서구

Common Values

Value | Count | Frequency (%)
서울특별시 강서구 | 315 | 99.4%
<NA> | 2 | 0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
서울특별시 | 315 | 49.8%
강서구 | 315 | 49.8%
na | 2 | 0.3%

설치위치(도로명주소)
Text

Distinct: 212
Distinct (%): 67.3%
Missing: 2
Missing (%): 0.6%
Memory size: 2.6 KiB

Length

Max length: 13
Median length: 12
Mean length: 7.1746032
Min length: 4

Characters and Unicode

Total characters: 2260
Distinct characters: 54
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
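
To illustrate how such character properties are looked up, the sketch below classifies the characters of the first sample row with Python's `unicodedata` module. The mapping from two-letter general-category codes to the long names used in this report is written out by hand and covers only the four categories reported for this column. How the report itself tokenizes and counts characters is not shown in the source.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in this column.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
}


def category_counts(text: str) -> Counter:
    """Count characters of `text` by Unicode general category."""
    codes = (unicodedata.category(ch) for ch in text)
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)


sample = "가로공원로 지하189"  # first sample row of this column
counts = category_counts(sample)
print(counts)
```

Hangul syllables fall under "Other Letter", ASCII digits under "Decimal Number" and the space under "Space Separator", which is why those categories dominate the tables for this column.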

Unique

Unique: 135
Unique (%): 42.9%

Sample

1st row: 가로공원로 지하189
2nd row: 가로공원로 지하189
3rd row: 강서로 지하54
4th row: 강서로 지하54
5th row: 화곡로 지하 168
Value | Count | Frequency (%)
공항대로 | 15 | 3.6%
9 | 8 | 1.9%
마곡중앙5로 | 8 | 1.9%
지하 | 7 | 1.7%
허준로 | 6 | 1.4%
양천로59길38 | 5 | 1.2%
마곡서1로 | 5 | 1.2%
화곡로302 | 5 | 1.2%
가로공원로 | 5 | 1.2%
개화동로11길 | 4 | 1.0%
Other values (230) | 352 | 83.8%

Most occurring characters

ValueCountFrequency (%)
319
 
14.1%
1 178
 
7.9%
3 129
 
5.7%
119
 
5.3%
2 113
 
5.0%
4 92
 
4.1%
6 82
 
3.6%
7 80
 
3.5%
5 77
 
3.4%
75
 
3.3%
Other values (44) 996
44.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1182
52.3%
Decimal Number 937
41.5%
Space Separator 119
 
5.3%
Dash Punctuation 22
 
1.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
319
27.0%
75
 
6.3%
74
 
6.3%
68
 
5.8%
64
 
5.4%
64
 
5.4%
63
 
5.3%
60
 
5.1%
51
 
4.3%
38
 
3.2%
Other values (32) 306
25.9%
Decimal Number
ValueCountFrequency (%)
1 178
19.0%
3 129
13.8%
2 113
12.1%
4 92
9.8%
6 82
8.8%
7 80
8.5%
5 77
8.2%
0 66
 
7.0%
9 65
 
6.9%
8 55
 
5.9%
Space Separator
ValueCountFrequency (%)
119
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 22
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1182
52.3%
Common 1078
47.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
319
27.0%
75
 
6.3%
74
 
6.3%
68
 
5.8%
64
 
5.4%
64
 
5.4%
63
 
5.3%
60
 
5.1%
51
 
4.3%
38
 
3.2%
Other values (32) 306
25.9%
Common
ValueCountFrequency (%)
1 178
16.5%
3 129
12.0%
119
11.0%
2 113
10.5%
4 92
8.5%
6 82
7.6%
7 80
7.4%
5 77
7.1%
0 66
 
6.1%
9 65
 
6.0%
Other values (2) 77
7.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1182
52.3%
ASCII 1078
47.7%

Most frequent character per block

Hangul
ValueCountFrequency (%)
319
27.0%
75
 
6.3%
74
 
6.3%
68
 
5.8%
64
 
5.4%
64
 
5.4%
63
 
5.3%
60
 
5.1%
51
 
4.3%
38
 
3.2%
Other values (32) 306
25.9%
ASCII
ValueCountFrequency (%)
1 178
16.5%
3 129
12.0%
119
11.0%
2 113
10.5%
4 92
8.5%
6 82
7.6%
7 80
7.4%
5 77
7.1%
0 66
 
6.1%
9 65
 
6.0%
Other values (2) 77
7.1%

세부위치
Text

Distinct: 259
Distinct (%): 82.2%
Missing: 2
Missing (%): 0.6%
Memory size: 2.6 KiB

Length

Max length: 32
Median length: 24
Mean length: 15.701587
Min length: 5

Characters and Unicode

Total characters: 4946
Distinct characters: 249
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 203
Unique (%): 64.4%

Sample

1st row: 나누리병원 16-199
2nd row: 나누리병원 16-200
3rd row: 까치산역 16-192
4th row: 까치산역 16-192
5th row: 화곡역.현대주유소 16-205
ValueCountFrequency (%)
버스정류장 23
 
4.2%
21
 
3.9%
양천로 8
 
1.5%
강서구청 7
 
1.3%
신방화역 6
 
1.1%
허준로 6
 
1.1%
횡단보도 5
 
0.9%
마곡동로 4
 
0.7%
마곡나루역 4
 
0.7%
출구 4
 
0.7%
Other values (353) 455
83.8%

Most occurring characters

ValueCountFrequency (%)
1 380
 
7.7%
6 269
 
5.4%
235
 
4.8%
- 216
 
4.4%
) 143
 
2.9%
( 143
 
2.9%
2 131
 
2.6%
114
 
2.3%
109
 
2.2%
107
 
2.2%
Other values (239) 3099
62.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 2872
58.1%
Decimal Number 1247
25.2%
Space Separator 235
 
4.8%
Dash Punctuation 216
 
4.4%
Close Punctuation 143
 
2.9%
Open Punctuation 143
 
2.9%
Other Punctuation 48
 
1.0%
Uppercase Letter 42
 
0.8%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
114
 
4.0%
109
 
3.8%
107
 
3.7%
87
 
3.0%
84
 
2.9%
82
 
2.9%
71
 
2.5%
70
 
2.4%
68
 
2.4%
60
 
2.1%
Other values (210) 2020
70.3%
Uppercase Letter
ValueCountFrequency (%)
S 10
23.8%
B 6
14.3%
K 5
11.9%
C 4
 
9.5%
D 3
 
7.1%
L 3
 
7.1%
G 3
 
7.1%
R 2
 
4.8%
I 2
 
4.8%
J 2
 
4.8%
Other values (2) 2
 
4.8%
Decimal Number
ValueCountFrequency (%)
1 380
30.5%
6 269
21.6%
2 131
 
10.5%
0 87
 
7.0%
3 81
 
6.5%
7 73
 
5.9%
4 71
 
5.7%
5 55
 
4.4%
9 52
 
4.2%
8 48
 
3.8%
Other Punctuation
ValueCountFrequency (%)
. 35
72.9%
, 11
 
22.9%
& 2
 
4.2%
Space Separator
ValueCountFrequency (%)
235
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 216
100.0%
Close Punctuation
ValueCountFrequency (%)
) 143
100.0%
Open Punctuation
ValueCountFrequency (%)
( 143
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 2872
58.1%
Common 2032
41.1%
Latin 42
 
0.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
114
 
4.0%
109
 
3.8%
107
 
3.7%
87
 
3.0%
84
 
2.9%
82
 
2.9%
71
 
2.5%
70
 
2.4%
68
 
2.4%
60
 
2.1%
Other values (210) 2020
70.3%
Common
ValueCountFrequency (%)
1 380
18.7%
6 269
13.2%
235
11.6%
- 216
10.6%
) 143
 
7.0%
( 143
 
7.0%
2 131
 
6.4%
0 87
 
4.3%
3 81
 
4.0%
7 73
 
3.6%
Other values (7) 274
13.5%
Latin
ValueCountFrequency (%)
S 10
23.8%
B 6
14.3%
K 5
11.9%
C 4
 
9.5%
D 3
 
7.1%
L 3
 
7.1%
G 3
 
7.1%
R 2
 
4.8%
I 2
 
4.8%
J 2
 
4.8%
Other values (2) 2
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
Hangul 2872
58.1%
ASCII 2074
41.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 380
18.3%
6 269
13.0%
235
11.3%
- 216
10.4%
) 143
 
6.9%
( 143
 
6.9%
2 131
 
6.3%
0 87
 
4.2%
3 81
 
3.9%
7 73
 
3.5%
Other values (19) 316
15.2%
Hangul
ValueCountFrequency (%)
114
 
4.0%
109
 
3.8%
107
 
3.7%
87
 
3.0%
84
 
2.9%
82
 
2.9%
71
 
2.5%
70
 
2.4%
68
 
2.4%
60
 
2.1%
Other values (210) 2020
70.3%

설치장소유형
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 KiB
정류소(버스,택시 등): 236
기타: 35
지하철역 입구: 27
도로변(횡단보도 포함): 17
<NA>: 2

Length

Max length: 12
Median length: 12
Mean length: 10.419558
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 정류소(버스,택시 등)
2nd row: 정류소(버스,택시 등)
3rd row: 정류소(버스,택시 등)
4th row: 정류소(버스,택시 등)
5th row: 정류소(버스,택시 등)

Common Values

Value | Count | Frequency (%)
정류소(버스,택시 등) | 236 | 74.4%
기타 | 35 | 11.0%
지하철역 입구 | 27 | 8.5%
도로변(횡단보도 포함) | 17 | 5.4%
<NA> | 2 | 0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
정류소(버스,택시 236
39.5%
236
39.5%
기타 35
 
5.9%
지하철역 27
 
4.5%
입구 27
 
4.5%
도로변(횡단보도 17
 
2.8%
포함 17
 
2.8%
na 2
 
0.3%

쓰레기통 형태
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 KiB
일반 사각 쓰레기통: 305
항아리형 쓰레기통: 10
<NA>: 2

Length

Max length: 10
Median length: 10
Mean length: 9.9305994
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반 사각 쓰레기통
2nd row: 일반 사각 쓰레기통
3rd row: 일반 사각 쓰레기통
4th row: 일반 사각 쓰레기통
5th row: 일반 사각 쓰레기통

Common Values

Value | Count | Frequency (%)
일반 사각 쓰레기통 | 305 | 96.2%
항아리형 쓰레기통 | 10 | 3.2%
<NA> | 2 | 0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
쓰레기통 | 315 | 33.6%
일반 | 305 | 32.6%
사각 | 305 | 32.6%
항아리형 | 10 | 1.1%
na | 2 | 0.2%

수거쓰레기 종류
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 KiB
일반쓰레기: 261
재활용쓰레기: 54
<NA>: 2

Length

Max length: 6
Median length: 5
Mean length: 5.1640379
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반쓰레기
2nd row: 일반쓰레기
3rd row: 일반쓰레기
4th row: 재활용쓰레기
5th row: 일반쓰레기

Common Values

Value | Count | Frequency (%)
일반쓰레기 | 261 | 82.3%
재활용쓰레기 | 54 | 17.0%
<NA> | 2 | 0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
일반쓰레기 | 261 | 82.3%
재활용쓰레기 | 54 | 17.0%
na | 2 | 0.6%

전화번호
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 KiB
02-2600-4067: 315
<NA>: 2

Length

Max length: 12
Median length: 12
Mean length: 11.949527
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 02-2600-4067
2nd row: 02-2600-4067
3rd row: 02-2600-4067
4th row: 02-2600-4067
5th row: 02-2600-4067

Common Values

Value | Count | Frequency (%)
02-2600-4067 | 315 | 99.4%
<NA> | 2 | 0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
02-2600-4067 | 315 | 99.4%
na | 2 | 0.6%

Interactions


Correlations

Variable | 연번 | 설치장소유형 | 쓰레기통 형태 | 수거쓰레기 종류
연번 | 1.000 | 0.549 | 0.667 | 0.533
설치장소유형 | 0.549 | 1.000 | 0.564 | 0.292
쓰레기통 형태 | 0.667 | 0.564 | 1.000 | 0.190
수거쓰레기 종류 | 0.533 | 0.292 | 0.190 | 1.000

Variable | 자치구명 | 설치장소유형 | 쓰레기통 형태 | 전화번호 | 수거쓰레기 종류
자치구명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
설치장소유형 | 1.000 | 1.000 | 0.386 | 1.000 | 0.194
쓰레기통 형태 | 1.000 | 0.386 | 1.000 | 1.000 | 0.122
전화번호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
수거쓰레기 종류 | 1.000 | 0.194 | 0.122 | 1.000 | 1.000

Variable | 연번 | 자치구명 | 설치장소유형 | 쓰레기통 형태 | 수거쓰레기 종류 | 전화번호
연번 | 1.000 | 1.000 | 0.358 | 0.512 | 0.406 | 1.000
자치구명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
설치장소유형 | 0.358 | 1.000 | 1.000 | 0.386 | 0.194 | 1.000
쓰레기통 형태 | 0.512 | 1.000 | 0.386 | 1.000 | 0.122 | 1.000
수거쓰레기 종류 | 0.406 | 1.000 | 0.194 | 0.122 | 1.000 | 1.000
전화번호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
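
The 1.000 entries for 자치구명 and 전화번호 are likely an artifact of the two fully empty records rather than a meaningful relationship: each of these columns has one real value plus `<NA>`, and the `<NA>` entries fall in the same two rows. The sketch below illustrates this with a plain (uncorrected) Cramér's V on the 2x2 contingency table of the two columns. The report does not say which association measure it uses for categorical pairs, so Cramér's V is an assumption here, and `cramers_v` is a hypothetical helper.

```python
import math


def cramers_v(table):
    """Plain (uncorrected) Cramér's V for a contingency table given as rows of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))


# Rows: 자치구명 = "서울특별시 강서구" / <NA>
# Columns: 전화번호 = "02-2600-4067" / <NA>
# The two <NA> rows are the fully empty records visible in the Sample section.
table = [[315, 0],
         [0, 2]]
v = cramers_v(table)
print(round(v, 3))
```

Dropping the two empty records before profiling would leave both columns constant, at which point no association can be computed for them at all.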

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

Index | 연번 | 자치구명 | 설치위치(도로명주소) | 세부위치 | 설치장소유형 | 쓰레기통 형태 | 수거쓰레기 종류 | 전화번호
0 | 1 | 서울특별시 강서구 | 가로공원로 지하189 | 나누리병원 16-199 | 정류소(버스,택시 등) | 일반 사각 쓰레기통 | 일반쓰레기 | 02-2600-4067
1 | 2 | 서울특별시 강서구 | 가로공원로 지하189 | 나누리병원 16-200 | 정류소(버스,택시 등) | 일반 사각 쓰레기통 | 일반쓰레기 | 02-2600-4067
2 | 3 | 서울특별시 강서구 | 강서로 지하54 | 까치산역 16-192 | 정류소(버스,택시 등) | 일반 사각 쓰레기통 | 일반쓰레기 | 02-2600-4067
3 | 4 | 서울특별시 강서구 | 강서로 지하54 | 까치산역 16-192 | 정류소(버스,택시 등) | 일반 사각 쓰레기통 | 재활용쓰레기 | 02-2600-4067
4 | 5 | 서울특별시 강서구 | 화곡로 지하 168 | 화곡역.현대주유소 16-205 | 정류소(버스,택시 등) | 일반 사각 쓰레기통 | 일반쓰레기 | 02-2600-4067
5 | 6 | 서울특별시 강서구 | 강서로239 | 화곡중고등학교16-211 | 정류소(버스,택시 등) | 일반 사각 쓰레기통 | 일반쓰레기 | 02-2600-4067
6 | 7 | 서울특별시 강서구 | 강서로242 | 화곡중고등학교16-212 | 정류소(버스,택시 등) | 일반 사각 쓰레기통 | 일반쓰레기 | 02-2600-4067
7 | 8 | 서울특별시 강서구 | 강서로 지하262 | 우장산역 1번출구16-213 | 정류소(버스,택시 등) | 일반 사각 쓰레기통 | 일반쓰레기 | 02-2600-4067
8 | 9 | 서울특별시 강서구 | 강서로307 | 명덕고등학교16-216 | 정류소(버스,택시 등) | 일반 사각 쓰레기통 | 일반쓰레기 | 02-2600-4067
9 | 10 | 서울특별시 강서구 | 강서로349 | 발산역힐스테이트16-218 | 정류소(버스,택시 등) | 일반 사각 쓰레기통 | 일반쓰레기 | 02-2600-4067

Index | 연번 | 자치구명 | 설치위치(도로명주소) | 세부위치 | 설치장소유형 | 쓰레기통 형태 | 수거쓰레기 종류 | 전화번호
307 | 308 | 서울특별시 강서구 | 양천로49길 7 | 양천초등학교.겸재정선미술관(16-269) | 정류소(버스,택시 등) | 일반 사각 쓰레기통 | 일반쓰레기 | 02-2600-4067
308 | 309 | 서울특별시 강서구 | 마곡동813 | 한국도레이 R&D센터 앞 | 도로변(횡단보도 포함) | 일반 사각 쓰레기통 | 일반쓰레기 | 02-2600-4067
309 | 310 | 서울특별시 강서구 | 방화대로34길 13 | 서울항공비즈니스고등학교 앞 | 도로변(횡단보도 포함) | 일반 사각 쓰레기통 | 일반쓰레기 | 02-2600-4067
310 | 311 | 서울특별시 강서구 | 방화대로 319 | 서울항공비즈니스고등학교(16-306) | 정류소(버스,택시 등) | 일반 사각 쓰레기통 | 일반쓰레기 | 02-2600-4067
311 | 312 | 서울특별시 강서구 | 방화대로 360 | 신방화사거리(16-431) | 정류소(버스,택시 등) | 일반 사각 쓰레기통 | 일반쓰레기 | 02-2600-4067
312 | 313 | 서울특별시 강서구 | 양천로24길 13 | 신방화사거리(16-432) | 정류소(버스,택시 등) | 일반 사각 쓰레기통 | 일반쓰레기 | 02-2600-4067
313 | 314 | 서울특별시 강서구 | 양천로125 | 방화동도시개발1,2단지아파트(16-304) | 정류소(버스,택시 등) | 일반 사각 쓰레기통 | 일반쓰레기 | 02-2600-4067
314 | 315 | 서울특별시 강서구 | 방화대로47길 9 | 강서공업고등학교(16-303) | 정류소(버스,택시 등) | 일반 사각 쓰레기통 | 일반쓰레기 | 02-2600-4067
315 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
316 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Index | 연번 | 자치구명 | 설치위치(도로명주소) | 세부위치 | 설치장소유형 | 쓰레기통 형태 | 수거쓰레기 종류 | 전화번호 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2
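
The single duplicated pattern is the fully empty record, which occurs twice. The sketch below shows one way to surface such rows with the standard library. `duplicated_rows` is a hypothetical helper, `None` stands in for `<NA>`, and only three of the eight columns are kept for brevity. The report appears to count each distinct duplicated pattern once, which would explain the figure of 1 duplicate row in the overview.

```python
from collections import Counter


def duplicated_rows(rows):
    """Return {row: occurrences} for every row that appears more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Two real records from the Sample section plus the two fully empty records.
rows = [
    (1, "가로공원로 지하189", "나누리병원 16-199"),
    (2, "가로공원로 지하189", "나누리병원 16-200"),
    (None, None, None),
    (None, None, None),
]
dupes = duplicated_rows(rows)
print(dupes)
```

The first two records share an address but differ in 연번 and 세부위치, so they are not reported as duplicates.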