Overview

Dataset statistics

Number of variables: 4
Number of observations: 6537
Missing cells: 33
Missing cells (%): 0.1%
Duplicate rows: 499
Duplicate rows (%): 7.6%
Total size in memory: 204.4 KiB
Average record size in memory: 32.0 B
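
The percentages and the average record size above can be re-derived from the raw counts. The following is a minimal sketch of that arithmetic; the variable names are illustrative, and it assumes the missing-cell percentage is taken over all cells (variables times observations).

```python
# Figures copied from the dataset statistics above.
n_variables = 4
n_observations = 6537
missing_cells = 33
duplicate_rows = 499
total_size_kib = 204.4

# Assumption: "Missing cells (%)" is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")               # Missing cells: 0.1%
print(f"Duplicate rows: {duplicate_pct:.1f}%")            # Duplicate rows: 7.6%
print(f"Average record size: {avg_record_bytes:.1f} B")   # Average record size: 32.0 B
```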

Variable types

Unsupported: 2
Categorical: 1
Text: 1

Dataset

Description: File download
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15069/F/1/datasetView.do

Alerts

Dataset has 499 (7.6%) duplicate rows [Duplicates]
2020년 가로쓰레기통(고정식) 설치 현황 (19.9. 기준) is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 3 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]

Reproduction

Analysis started: 2024-03-13 06:32:33.151416
Analysis finished: 2024-03-13 06:32:33.569174
Duration: 0.42 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

2020년 가로쓰레기통(고정식) 설치 현황 (19.9. 기준)
Unsupported

Missing: 1
Missing (%): < 0.1%
Memory size: 51.2 KiB

Unnamed: 1
Categorical

Distinct: 27
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 51.2 KiB

Value | Count
강남구 | 960
강동구 | 436
중구 | 360
구로구 | 325
종로구 | 304
Other values (22) | 4152

Length

Max length: 4
Median length: 3
Mean length: 3.0439039
Min length: 2

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: <NA>
2nd row: 자치구명
3rd row: 종로구
4th row: 종로구
5th row: 종로구

Common Values

Value | Count | Frequency (%)
강남구 | 960 | 14.7%
강동구 | 436 | 6.7%
중구 | 360 | 5.5%
구로구 | 325 | 5.0%
종로구 | 304 | 4.7%
송파구 | 301 | 4.6%
은평구 | 295 | 4.5%
서대문구 | 290 | 4.4%
강서구 | 274 | 4.2%
마포구 | 274 | 4.2%
Other values (17) | 2718 | 41.6%

Length

[Figure: Histogram of lengths of the category]

Unnamed: 2
Text

Distinct: 891
Distinct (%): 13.7%
Missing: 31
Missing (%): 0.5%
Memory size: 51.2 KiB

Length

Max length: 23
Median length: 22
Mean length: 4.0936059
Min length: 2

Characters and Unicode

Total characters: 26633
Distinct characters: 290
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
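
As an illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` and the `CATEGORY_NAMES` mapping are illustrative, not part of the report. The sample strings are taken from the rows shown elsewhere in this report, on the assumption that the `\n` in the displayed values is a real newline character.

```python
import unicodedata
from collections import Counter

# Two-letter general category codes mapped to the names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
    "Pe": "Close Punctuation",
    "Ps": "Open Punctuation",
    "Lu": "Uppercase Letter",
    "Cc": "Control",
    "Po": "Other Punctuation",
}

def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# Sample values (assumption: "\n" is a newline in the underlying data).
sample = ["사직로", "자하문로", "도로(가로)명", "효자동 \n정류소 ID : 01-112"]
counts = category_counts(sample)

for code, n in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

Hangul syllables fall under Other Letter, which is consistent with that category dominating the tables below.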

Unique

Unique: 444
Unique (%): 6.8%

Sample

1st row: 도로(가로)명
2nd row: 사직로
3rd row: 자하문로
4th row: 자하문로
5th row: 자하문로

Value | Count | Frequency (%)
천호대로 | 199 | 2.8%
남부순환로 | 154 | 2.2%
올림픽로 | 109 | 1.5%
시청가로 | 108 | 1.5%
영동대로 | 106 | 1.5%
신당가로 | 102 | 1.5%
강남대로 | 100 | 1.4%
도봉로 | 95 | 1.4%
통일로 | 94 | 1.3%
삼성로 | 80 | 1.1%
Other values (924) | 5886 | 83.7%

Most occurring characters

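Value | Count | Frequency (%)
[unreadable] | 6283 | 23.6%
[unreadable] | 1086 | 4.1%
[unreadable] | 691 | 2.6%
(space) | 678 | 2.5%
[unreadable] | 486 | 1.8%
[unreadable] | 433 | 1.6%
1 | 423 | 1.6%
[unreadable] | 378 | 1.4%
[unreadable] | 370 | 1.4%
[unreadable] | 339 | 1.3%
Other values (280) | 15466 | 58.1%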

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 23615 | 88.7%
Decimal Number | 2104 | 7.9%
Space Separator | 678 | 2.5%
Dash Punctuation | 71 | 0.3%
Close Punctuation | 43 | 0.2%
Open Punctuation | 43 | 0.2%
Uppercase Letter | 39 | 0.1%
Control | 20 | 0.1%
Other Punctuation | 20 | 0.1%

Most frequent character per category

Other Letter

Value | Count | Frequency (%)
[unreadable] | 6283 | 26.6%
[unreadable] | 1086 | 4.6%
[unreadable] | 691 | 2.9%
[unreadable] | 486 | 2.1%
[unreadable] | 433 | 1.8%
[unreadable] | 378 | 1.6%
[unreadable] | 370 | 1.6%
[unreadable] | 339 | 1.4%
[unreadable] | 338 | 1.4%
[unreadable] | 306 | 1.3%
Other values (260) | 12905 | 54.6%

Decimal Number

Value | Count | Frequency (%)
1 | 423 | 20.1%
2 | 283 | 13.5%
5 | 218 | 10.4%
7 | 188 | 8.9%
3 | 188 | 8.9%
6 | 187 | 8.9%
4 | 165 | 7.8%
0 | 163 | 7.7%
9 | 151 | 7.2%
8 | 138 | 6.6%

Uppercase Letter

Value | Count | Frequency (%)
D | 19 | 48.7%
I | 19 | 48.7%
R | 1 | 2.6%

Other Punctuation

Value | Count | Frequency (%)
: | 19 | 95.0%
. | 1 | 5.0%

Space Separator

Value | Count | Frequency (%)
(space) | 678 | 100.0%

Dash Punctuation

Value | Count | Frequency (%)
- | 71 | 100.0%

Close Punctuation

Value | Count | Frequency (%)
) | 43 | 100.0%

Open Punctuation

Value | Count | Frequency (%)
( | 43 | 100.0%

Control

Value | Count | Frequency (%)
(control character) | 20 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 23615 | 88.7%
Common | 2979 | 11.2%
Latin | 39 | 0.1%

Most frequent character per script

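Hangul

Value | Count | Frequency (%)
[unreadable] | 6283 | 26.6%
[unreadable] | 1086 | 4.6%
[unreadable] | 691 | 2.9%
[unreadable] | 486 | 2.1%
[unreadable] | 433 | 1.8%
[unreadable] | 378 | 1.6%
[unreadable] | 370 | 1.6%
[unreadable] | 339 | 1.4%
[unreadable] | 338 | 1.4%
[unreadable] | 306 | 1.3%
Other values (260) | 12905 | 54.6%

Common

Value | Count | Frequency (%)
(space) | 678 | 22.8%
1 | 423 | 14.2%
2 | 283 | 9.5%
5 | 218 | 7.3%
7 | 188 | 6.3%
3 | 188 | 6.3%
6 | 187 | 6.3%
4 | 165 | 5.5%
0 | 163 | 5.5%
9 | 151 | 5.1%
Other values (7) | 335 | 11.2%

Latin

Value | Count | Frequency (%)
D | 19 | 48.7%
I | 19 | 48.7%
R | 1 | 2.6%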

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 23615 | 88.7%
ASCII | 3018 | 11.3%

Most frequent character per block

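Hangul

Value | Count | Frequency (%)
[unreadable] | 6283 | 26.6%
[unreadable] | 1086 | 4.6%
[unreadable] | 691 | 2.9%
[unreadable] | 486 | 2.1%
[unreadable] | 433 | 1.8%
[unreadable] | 378 | 1.6%
[unreadable] | 370 | 1.6%
[unreadable] | 339 | 1.4%
[unreadable] | 338 | 1.4%
[unreadable] | 306 | 1.3%
Other values (260) | 12905 | 54.6%

ASCII

Value | Count | Frequency (%)
(space) | 678 | 22.5%
1 | 423 | 14.0%
2 | 283 | 9.4%
5 | 218 | 7.2%
7 | 188 | 6.2%
3 | 188 | 6.2%
6 | 187 | 6.2%
4 | 165 | 5.5%
0 | 163 | 5.4%
9 | 151 | 5.0%
Other values (10) | 374 | 12.4%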

Unnamed: 3
Unsupported

REJECTED  UNSUPPORTED 

Missing: 1
Missing (%): < 0.1%
Memory size: 51.2 KiB

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
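
To make the idea of nullity correlation concrete, here is a minimal sketch that computes the Pearson correlation between the present/missing indicators of two columns. It is an illustration under stated assumptions, not the code used to draw the heatmap: the function name `nullity_correlation` is hypothetical, and missing values are represented as `None`.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the present (1) / missing (0) indicators."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Undefined when a column is always present or always missing.
        return None
    return cov / sqrt(var_a * var_b)

# Toy columns: the first two are missing in the same rows,
# the third is missing exactly where the first is present.
col_x = [None, "a", "b", None]
col_y = [None, "c", "d", None]
col_z = ["e", None, None, "f"]

same = nullity_correlation(col_x, col_y)       # 1.0: missing together
opposite = nullity_correlation(col_x, col_z)   # -1.0: one present when the other is missing
```

A value near 1 means two columns tend to be missing in the same rows, and a value near -1 means one tends to be present when the other is missing.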

Sample

Index | 2020년 가로쓰레기통(고정식) 설치 현황 (19.9. 기준) | Unnamed: 1 | Unnamed: 2 | Unnamed: 3
0 | NaN | <NA> | <NA> | NaN
1 | 연번 | 자치구명 | 도로(가로)명 | 설치위치
2 | 1 | 종로구 | 사직로 | 경복궁역 4번출구
3 | 2 | 종로구 | 자하문로 | 자하문로 12
4 | 3 | 종로구 | 자하문로 | 자하문로 44
5 | 4 | 종로구 | 자하문로 | 효자동 \n정류소 ID : 01-112
6 | 5 | 종로구 | 효자로 | 청와대 분수대\n(사랑채)
7 | 6 | 종로구 | 자하문로 | 경복궁역 \n정류소 ID : 01-115
8 | 7 | 종로구 | 새문안로 | 새문안로 46 씨티은행
9 | 8 | 종로구 | 신문로 | 신문로2가 82 에스타워

Index | 2020년 가로쓰레기통(고정식) 설치 현황 (19.9. 기준) | Unnamed: 1 | Unnamed: 2 | Unnamed: 3
6527 | 427 | 강동구 | 올림픽로 | 797 앞 도로
6528 | 428 | 강동구 | 올림픽로 | 809 앞 도로
6529 | 429 | 강동구 | 올림픽로 | 813 앞 도로
6530 | 430 | 강동구 | 올림픽로 | 암사역 1번 출구
6531 | 431 | 강동구 | 고덕로62길 | 55 주양쇼핑 앞
6532 | 432 | 강동구 | 아리수로 | 426 공영차고지 앞
6533 | 433 | 강동구 | 아리수로 | 419 리슈빌오피스텔 앞
6534 | 434 | 강동구 | 강동대로 | 207 윤선생 앞
6535 | 435 | 강동구 | 천호대로 | 1092
6536 | 436 | 강동구 | 천호대로 | 1027 신한은행 앞
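
The first rows of the sample suggest that the sheet's title was read as the column header, while the real column names (연번, 자치구명, 도로(가로)명, 설치위치) ended up in the row with index 1. That would also explain the mixed-type, Unsupported columns. A minimal pandas sketch of one way to promote that row to the header follows; the helper `promote_header` is hypothetical, and the small frame is a hand-built stand-in for the real file, which is not loaded here.

```python
import pandas as pd

# Stand-in for the first rows of the sample shown above.
raw = pd.DataFrame(
    [
        [None, None, None, None],
        ["연번", "자치구명", "도로(가로)명", "설치위치"],
        [1, "종로구", "사직로", "경복궁역 4번출구"],
        [2, "종로구", "자하문로", "자하문로 12"],
    ],
    columns=[
        "2020년 가로쓰레기통(고정식) 설치 현황 (19.9. 기준)",
        "Unnamed: 1",
        "Unnamed: 2",
        "Unnamed: 3",
    ],
)

def promote_header(df, header_row=1):
    """Use the values in `header_row` as column names and drop the rows up to it."""
    header = df.iloc[header_row].tolist()
    body = df.iloc[header_row + 1:].reset_index(drop=True)
    body.columns = header
    return body

clean = promote_header(raw)
print(clean)
```

Profiling the cleaned frame instead of the raw one would likely give the first and last columns usable types, though that has not been verified against the full dataset.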

Duplicate rows

Most frequently occurring

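Index | Unnamed: 1 | Unnamed: 2 | # duplicates
50 | 강동구 | 천호대로 | 123
493 | 중구 | 시청가로 | 108
21 | 강남구 | 영동대로 | 106
494 | 중구 | 신당가로 | 102
15 | 강남구 | 삼성로 | 80
495 | 중구 | 중부가로 | 78
482 | 종로구 | 종로 | 73
24 | 강남구 | 테헤란로 | 72
492 | 중구 | 서울역가로 | 72
14 | 강남구 | 봉은사로 | 70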
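
The table lists only Unnamed: 1 and Unnamed: 2, which suggests that duplicates were counted on those two columns alone, with the two Unsupported columns left out. If that is the case, many of these "duplicates" may simply be different bins on the same road. The sketch below shows how the choice of key columns changes the result; `duplicate_groups` is a hypothetical helper, and the rows are three records from the sample above with the column names taken from the row at index 1.

```python
from collections import Counter

def duplicate_groups(rows, key_columns):
    """Return (key, count) pairs for keys that occur more than once, largest first."""
    keys = [tuple(row[c] for c in key_columns) for row in rows]
    counts = Counter(keys)
    groups = [(key, n) for key, n in counts.items() if n > 1]
    return sorted(groups, key=lambda item: item[1], reverse=True)

rows = [
    {"연번": 1, "자치구명": "종로구", "도로(가로)명": "사직로", "설치위치": "경복궁역 4번출구"},
    {"연번": 2, "자치구명": "종로구", "도로(가로)명": "자하문로", "설치위치": "자하문로 12"},
    {"연번": 3, "자치구명": "종로구", "도로(가로)명": "자하문로", "설치위치": "자하문로 44"},
]

# District and road only: the two 자하문로 rows collapse into one group.
by_two = duplicate_groups(rows, ["자치구명", "도로(가로)명"])

# All four columns: no row repeats.
by_all = duplicate_groups(rows, ["연번", "자치구명", "도로(가로)명", "설치위치"])
```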