Overview

Dataset statistics

Number of variables: 5
Number of observations: 241
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 9.8 KiB
Average record size in memory: 41.5 B

Variable types

Numeric: 1
Categorical: 2
Text: 1
DateTime: 1

Dataset

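Description: Installation status of dedicated cigarette-butt trash bins (including 꽁초픽 units) in Yeongdeungpo-gu, Seoul. Fields provided: 연번 (serial number), 동명 (neighborhood name), 설치위치 (installation location), 비고 (remarks), etc.
Author: 서울특별시 영등포구 (Yeongdeungpo-gu, Seoul)
URL: https://www.data.go.kr/data/15103114/fileData.do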

Alerts

데이터기준일 has constant value "2022-08-03" (Constant)
동명 is highly overall correlated with 연번 and 1 other field (High correlation)
비고 is highly overall correlated with 연번 and 1 other field (High correlation)
연번 is highly overall correlated with 동명 and 1 other field (High correlation)
비고 is highly imbalanced (75.0%) (Imbalance)
연번 has unique values (Unique)
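
The 75.0% imbalance figure for 비고 is consistent with a normalised Shannon-entropy score, 1 - H / log2(k), applied to the category counts reported for that variable (226, 9, 6). The sketch below rests on that assumption and is not a statement of the profiler's internals; the helper name `imbalance_score` is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (in bits) of the counts.

    Hypothetical helper based on an assumed formula: 0.0 means perfectly
    balanced classes, values close to 1.0 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Category counts reported for 비고: <NA>, 꽁초픽, 동 자체보관
score = imbalance_score([226, 9, 6])
print(f"{score:.1%}")
```

Under this assumption the score rounds to the 75.0% shown in the alert, which suggests the alert reflects how strongly `<NA>` dominates the three categories.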

Reproduction

Analysis started: 2023-12-12 17:51:03.931730
Analysis finished: 2023-12-12 17:51:04.560902
Duration: 0.63 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 241
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 121
Minimum: 1
Maximum: 241
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 13
Q1: 61
Median: 121
Q3: 181
95-th percentile: 229
Maximum: 241
Range: 240
Interquartile range (IQR): 120
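
Because 연번 is a strictly increasing serial number from 1 to 241, the quantile statistics can be checked by hand. The sketch below assumes linear interpolation between order statistics; the report does not say which method the profiler uses, and the helper `percentile` is a hypothetical name.

```python
def percentile(sorted_values, pct):
    """Linear-interpolation percentile for an integer pct in [0, 100].

    Integer arithmetic is used for the position so that exact ranks
    (such as 240 * 5 / 100 = 12) are not affected by float rounding.
    """
    last = len(sorted_values) - 1
    lower, remainder = divmod(last * pct, 100)
    upper = min(lower + 1, last)
    frac = remainder / 100
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * frac


values = list(range(1, 242))  # 연번 runs from 1 to 241
quantiles = {pct: percentile(values, pct) for pct in (5, 25, 50, 75, 95)}
value_range = values[-1] - values[0]
iqr = quantiles[75] - quantiles[25]
print(quantiles, value_range, iqr)
```

With 241 evenly spaced values every requested percentile falls exactly on an observation, so no interpolation actually takes place here.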

Descriptive statistics

Standard deviation: 69.714896
Coefficient of variation (CV): 0.57615616
Kurtosis: -1.2
Mean: 121
Median Absolute Deviation (MAD): 60
Skewness: 0
Sum: 29161
Variance: 4860.1667
Monotonicity: Strictly increasing
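
The descriptive statistics can be reproduced from the sequence 1 to 241 using only the Python standard library. This is a minimal sketch that assumes the reported variance is the sample variance (n - 1 denominator) and that MAD is the median of absolute deviations from the median; the numbers in the report are consistent with both assumptions.

```python
import math
import statistics

values = list(range(1, 242))  # 연번 runs from 1 to 241

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = math.sqrt(variance)
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, round(variance, 4), round(std, 6), round(cv, 8), mad)
```

The skewness of 0 follows from the symmetry of the sequence around its mean of 121.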
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 0.4%
182 | 1 | 0.4%
154 | 1 | 0.4%
155 | 1 | 0.4%
156 | 1 | 0.4%
157 | 1 | 0.4%
158 | 1 | 0.4%
159 | 1 | 0.4%
160 | 1 | 0.4%
161 | 1 | 0.4%
Other values (231) | 231 | 95.9%

Value | Count | Frequency (%)
1 | 1 | 0.4%
2 | 1 | 0.4%
3 | 1 | 0.4%
4 | 1 | 0.4%
5 | 1 | 0.4%
6 | 1 | 0.4%
7 | 1 | 0.4%
8 | 1 | 0.4%
9 | 1 | 0.4%
10 | 1 | 0.4%

Value | Count | Frequency (%)
241 | 1 | 0.4%
240 | 1 | 0.4%
239 | 1 | 0.4%
238 | 1 | 0.4%
237 | 1 | 0.4%
236 | 1 | 0.4%
235 | 1 | 0.4%
234 | 1 | 0.4%
233 | 1 | 0.4%
232 | 1 | 0.4%

동명
Categorical

HIGH CORRELATION 

Distinct: 17
Distinct (%): 7.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

대림1동: 53
영등포동: 44
문래동: 29
당산2동: 21
당산1동: 15
Other values (12): 79

Length

Max length: 5
Median length: 4
Mean length: 3.9253112
Min length: 3

Unique

Unique: 1
Unique (%): 0.4%

Sample

1st row: 당산1동
2nd row: 당산1동
3rd row: 당산1동
4th row: 당산1동
5th row: 당산1동

Common Values

Value | Count | Frequency (%)
대림1동 | 53 | 22.0%
영등포동 | 44 | 18.3%
문래동 | 29 | 12.0%
당산2동 | 21 | 8.7%
당산1동 | 15 | 6.2%
대림2동 | 15 | 6.2%
신길5동 | 14 | 5.8%
대림3동 | 8 | 3.3%
신길6동 | 7 | 2.9%
신길1동 | 7 | 2.9%
Other values (7) | 28 | 11.6%
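
The Frequency (%) column is each count divided by the 241 observations, rounded to one decimal place. A short sketch that recomputes the column from the counts listed above (the dictionary simply restates the table, and the variable names are illustrative):

```python
# Counts for 동명 as listed in the Common Values table
counts = {
    "대림1동": 53, "영등포동": 44, "문래동": 29, "당산2동": 21,
    "당산1동": 15, "대림2동": 15, "신길5동": 14, "대림3동": 8,
    "신길6동": 7, "신길1동": 7, "Other values (7)": 28,
}

total = sum(counts.values())  # should equal the number of observations
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, share in shares.items():
    print(f"{name}: {share}%")
```

The counts add up to 241, and the ten named categories plus the seven grouped under "Other values" account for the 17 distinct values reported for this variable.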

Length

Histogram of lengths of the category

주소
Text

Distinct: 206
Distinct (%): 85.5%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

Length

Max length: 50
Median length: 38
Mean length: 23.062241
Min length: 16

Characters and Unicode

Total characters: 5558
Distinct characters: 248
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
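
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for the first sample row of 주소; it covers categories only (the standard library does not expose scripts or blocks), and `category_counts` is a hypothetical helper name.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters by Unicode general category code (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "서울특별시 영등포구 당산로32길 1-6"  # first sample row of 주소
counts = category_counts(sample)

# Lo = Other Letter, Zs = Space Separator, Nd = Decimal Number, Pd = Dash Punctuation
for code, n in counts.most_common():
    print(code, n)
```

Hangul syllables fall under Other Letter (`Lo`), which is why that category dominates the address text.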

Unique

Unique: 185
Unique (%): 76.8%

Sample

1st row: 서울특별시 영등포구 당산로32길 1-6
2nd row: 서울특별시 영등포구 국회대로36길 7-3
3rd row: 서울특별시 영등포구 당산로32길 5 이차돌
4th row: 서울특별시 영등포구 국회대로34길 4 가화
5th row: 서울특별시 영등포구 국회대로34길 3

Value | Count | Frequency (%)
서울특별시 | 243 | 21.8%
영등포구 | 241 | 21.6%
(not captured) | 19 | 1.7%
대림로 | 18 | 1.6%
디지털로 | 18 | 1.6%
2 | 14 | 1.3%
도림로 | 9 | 0.8%
4 | 9 | 0.8%
신풍로 | 8 | 0.7%
신길로 | 7 | 0.6%
Other values (334) | 531 | 47.5%

Most occurring characters

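Value | Count | Frequency (%)
(space) | 892 | 16.0%
(not captured) | 293 | 5.3%
(not captured) | 269 | 4.8%
(not captured) | 266 | 4.8%
(not captured) | 253 | 4.6%
(not captured) | 247 | 4.4%
(not captured) | 245 | 4.4%
(not captured) | 243 | 4.4%
(not captured) | 243 | 4.4%
(not captured) | 242 | 4.4%
Other values (238) | 2365 | 42.6%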

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3723 | 67.0%
Space Separator | 892 | 16.0%
Decimal Number | 825 | 14.8%
Dash Punctuation | 48 | 0.9%
Close Punctuation | 30 | 0.5%
Open Punctuation | 30 | 0.5%
Uppercase Letter | 8 | 0.1%
Other Punctuation | 1 | < 0.1%
Math Symbol | 1 | < 0.1%

Most frequent character per category

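Other Letter
Value | Count | Frequency (%)
(not captured) | 293 | 7.9%
(not captured) | 269 | 7.2%
(not captured) | 266 | 7.1%
(not captured) | 253 | 6.8%
(not captured) | 247 | 6.6%
(not captured) | 245 | 6.6%
(not captured) | 243 | 6.5%
(not captured) | 243 | 6.5%
(not captured) | 242 | 6.5%
(not captured) | 239 | 6.4%
Other values (216) | 1183 | 31.8%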

Decimal Number
Value | Count | Frequency (%)
1 | 179 | 21.7%
2 | 112 | 13.6%
3 | 111 | 13.5%
4 | 102 | 12.4%
5 | 66 | 8.0%
9 | 60 | 7.3%
6 | 52 | 6.3%
7 | 50 | 6.1%
8 | 50 | 6.1%
0 | 43 | 5.2%

Uppercase Letter
Value | Count | Frequency (%)
U | 2 | 25.0%
S | 2 | 25.0%
G | 1 | 12.5%
R | 1 | 12.5%
T | 1 | 12.5%
C | 1 | 12.5%

Space Separator
Value | Count | Frequency (%)
(space) | 892 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 48 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 30 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 30 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 1 | 100.0%

Math Symbol
Value | Count | Frequency (%)
× | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3723 | 67.0%
Common | 1827 | 32.9%
Latin | 8 | 0.1%

Most frequent character per script

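Hangul
Value | Count | Frequency (%)
(not captured) | 293 | 7.9%
(not captured) | 269 | 7.2%
(not captured) | 266 | 7.1%
(not captured) | 253 | 6.8%
(not captured) | 247 | 6.6%
(not captured) | 245 | 6.6%
(not captured) | 243 | 6.5%
(not captured) | 243 | 6.5%
(not captured) | 242 | 6.5%
(not captured) | 239 | 6.4%
Other values (216) | 1183 | 31.8%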

Common
Value | Count | Frequency (%)
(space) | 892 | 48.8%
1 | 179 | 9.8%
2 | 112 | 6.1%
3 | 111 | 6.1%
4 | 102 | 5.6%
5 | 66 | 3.6%
9 | 60 | 3.3%
6 | 52 | 2.8%
7 | 50 | 2.7%
8 | 50 | 2.7%
Other values (6) | 153 | 8.4%

Latin
Value | Count | Frequency (%)
U | 2 | 25.0%
S | 2 | 25.0%
G | 1 | 12.5%
R | 1 | 12.5%
T | 1 | 12.5%
C | 1 | 12.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3723 | 67.0%
ASCII | 1834 | 33.0%
None | 1 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 892 | 48.6%
1 | 179 | 9.8%
2 | 112 | 6.1%
3 | 111 | 6.1%
4 | 102 | 5.6%
5 | 66 | 3.6%
9 | 60 | 3.3%
6 | 52 | 2.8%
7 | 50 | 2.7%
8 | 50 | 2.7%
Other values (11) | 160 | 8.7%
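
Hangul
Value | Count | Frequency (%)
(not captured) | 293 | 7.9%
(not captured) | 269 | 7.2%
(not captured) | 266 | 7.1%
(not captured) | 253 | 6.8%
(not captured) | 247 | 6.6%
(not captured) | 245 | 6.6%
(not captured) | 243 | 6.5%
(not captured) | 243 | 6.5%
(not captured) | 242 | 6.5%
(not captured) | 239 | 6.4%
Other values (216) | 1183 | 31.8%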

None
Value | Count | Frequency (%)
× | 1 | 100.0%

비고
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

<NA>: 226
꽁초픽: 9
동 자체보관: 6

Length

Max length: 6
Median length: 4
Mean length: 4.0124481
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 226 | 93.8%
꽁초픽 | 9 | 3.7%
동 자체보관 | 6 | 2.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 226 | 91.5%
꽁초픽 | 9 | 3.6%
동 | 6 | 2.4%
자체보관 | 6 | 2.4%

데이터기준일
Date

CONSTANT 

Distinct: 1
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB
Minimum: 2022-08-03 00:00:00
Maximum: 2022-08-03 00:00:00
Histogram with fixed size bins (bins=1)

Interactions


Correlations

 | 연번 | 동명 | 비고
연번 | 1.000 | 0.935 | 1.000
동명 | 0.935 | 1.000 | 1.000
비고 | 1.000 | 1.000 | 1.000

 | 동명 | 비고
동명 | 1.000 | 0.920
비고 | 0.920 | 1.000

 | 연번 | 동명 | 비고
연번 | 1.000 | 0.721 | 0.920
동명 | 0.721 | 1.000 | 0.920
비고 | 0.920 | 0.920 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index) | 연번 | 동명 | 주소 | 비고 | 데이터기준일
0 | 1 | 당산1동 | 서울특별시 영등포구 당산로32길 1-6 | <NA> | 2022-08-03
1 | 2 | 당산1동 | 서울특별시 영등포구 국회대로36길 7-3 | <NA> | 2022-08-03
2 | 3 | 당산1동 | 서울특별시 영등포구 당산로32길 5 이차돌 | <NA> | 2022-08-03
3 | 4 | 당산1동 | 서울특별시 영등포구 국회대로34길 4 가화 | <NA> | 2022-08-03
4 | 5 | 당산1동 | 서울특별시 영등포구 국회대로34길 3 | <NA> | 2022-08-03
5 | 6 | 당산1동 | 서울특별시 영등포구 당산로31길 4 | <NA> | 2022-08-03
6 | 7 | 당산1동 | 서울특별시 영등포구 당산로36길 14 | <NA> | 2022-08-03
7 | 8 | 당산1동 | 서울특별시 영등포구 당산로36길 14 | <NA> | 2022-08-03
8 | 9 | 당산1동 | 서울특별시 영등포구 양산로23길 17 | <NA> | 2022-08-03
9 | 10 | 당산1동 | 서울특별시 영등포구 양산로23길 11 | 동 자체보관 | 2022-08-03

(index) | 연번 | 동명 | 주소 | 비고 | 데이터기준일
231 | 232 | 영등포동 | 서울특별시 영등포구 영등포로35가길 5(로젠빌) | <NA> | 2022-08-03
232 | 233 | 영등포동 | 서울특별시 영등포구 영등포로33길 15-1 | <NA> | 2022-08-03
233 | 234 | 영등포동 | 서울특별시 영등포구 영등포로33길 17 | <NA> | 2022-08-03
234 | 235 | 영등포동 | 서울특별시 영등포구 영중로 65 동방미식성 | <NA> | 2022-08-03
235 | 236 | 영등포동 | 서울특별시 영등포구 영등포로42길 21-2 | <NA> | 2022-08-03
236 | 237 | 영등포동 | 서울특별시 영등포구 경인로112길 4 | <NA> | 2022-08-03
237 | 238 | 영등포본동 | 서울특별시 영등포구 영신로15길 7 현대기술학원 정문 | <NA> | 2022-08-03
238 | 239 | 영등포본동 | 서울특별시 영등포구 영신로15길 7 현대기술학원 후문 | <NA> | 2022-08-03
239 | 240 | 영등포본동 | 서울특별시 영등포구 영신로17길 | <NA> | 2022-08-03
240 | 241 | 영등포본동 | 서울특별시 영등포구 도신로29길 28 | <NA> | 2022-08-03