Overview

Dataset statistics

Number of variables: 8
Number of observations: 52
Missing cells: 62
Missing cells (%): 14.9%
Duplicate rows: 2
Duplicate rows (%): 3.8%
Total size in memory: 3.4 KiB
Average record size in memory: 66.5 B
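
The percentage figures follow from the raw counts. The sketch below reproduces that arithmetic, assuming the missing-cell share is taken over all cells (rows times columns) and the duplicate share over the row count. The variable names are illustrative and are not taken from the profiler.

```python
# Counts copied from the dataset statistics of this report.
n_variables = 8
n_observations = 52
missing_cells = 62
duplicate_rows = 2

# Assumption: the missing-cell share is computed over every cell (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: the duplicate share is computed over the number of rows.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```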

Variable types

Unsupported: 5
Text: 3

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-12855/F/1/datasetView.do

Alerts

Dataset has 2 (3.8%) duplicate rows [Duplicates]
Unnamed: 0 has 10 (19.2%) missing values [Missing]
Unnamed: 1 has 10 (19.2%) missing values [Missing]
도시대기측정소 has 9 (17.3%) missing values [Missing]
Unnamed: 3 has 9 (17.3%) missing values [Missing]
Unnamed: 4 has 6 (11.5%) missing values [Missing]
Unnamed: 5 has 6 (11.5%) missing values [Missing]
Unnamed: 6 has 6 (11.5%) missing values [Missing]
Unnamed: 7 has 6 (11.5%) missing values [Missing]
Unnamed: 0 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 4 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 5 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 6 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 7 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
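
One plausible reason for the unsupported-type alerts is that each flagged column mixes missing cells, a text header label, and numeric measurements, as the rows in the Sample section suggest. The sketch below shows one way to inspect and clean such a column with the standard library only. The helper `to_number` and the excerpt `raw_column` are hypothetical and are not part of the report or of ydata-profiling.

```python
from typing import Optional


def to_number(cell: object) -> Optional[float]:
    """Return the cell as a float, or None when it is missing or not numeric."""
    if cell is None:
        return None
    try:
        return float(cell)
    except (TypeError, ValueError):
        return None


# Hypothetical excerpt of one flagged column: a missing cell, a header label,
# then measurements, modelled on the rows shown in the Sample section.
raw_column = [None, "표고점\n(m)", 21.4, 32.3, 38.4, 13.5]

# The set of Python types present; more than one type means a mixed column.
kinds = sorted({type(cell).__name__ for cell in raw_column})

# Non-numeric cells become None, numeric cells become floats.
cleaned = [to_number(cell) for cell in raw_column]

print(kinds)
print(cleaned)
```

After a pass like this the column holds only floats and missing markers, which a profiler could then treat as numeric.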

Reproduction

Analysis started: 2023-12-11 09:55:43.724465
Analysis finished: 2023-12-11 09:55:44.420013
Duration: 0.7 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
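
The reported duration is the difference between the two timestamps, rounded to one decimal place. A short check of that arithmetic, using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction panel of this report.
started = datetime.fromisoformat("2023-12-11 09:55:43.724465")
finished = datetime.fromisoformat("2023-12-11 09:55:44.420013")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.1f} seconds")
```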

Variables

Unnamed: 0
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10
Missing (%): 19.2%
Memory size: 548.0 B

Unnamed: 1
Text

MISSING 

Distinct: 41
Distinct (%): 97.6%
Missing: 10
Missing (%): 19.2%
Memory size: 548.0 B

Length

Max length: 5
Median length: 3
Mean length: 3.1666667
Min length: 2

Characters and Unicode

Total characters: 133
Distinct characters: 62
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
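
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper `category_counts` is hypothetical, and the three station names are taken from elsewhere in this report. The two-letter codes map to the category names used here: `Lo` is Other Letter, `Nd` is Decimal Number, and `Zs` is Space Separator.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in the values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Station names that appear in this report's tables and sample rows.
sample = ["종로구", "청계4가", "종 로"]
counts = category_counts(sample)
print(counts)
```

Scripts and blocks are not exposed by `unicodedata`, so reproducing those panels would need a separate lookup table or a third-party package.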

Unique

Unique: 40
Unique (%): 95.2%
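
The Distinct and Unique figures differ by one because a single value occurs twice. The sketch below assumes that "distinct" counts different non-missing values and "unique" counts values that occur exactly once. The helper name and the placeholder station names are hypothetical; only the shape of the column (42 present values, one of them repeated, 10 missing) mirrors the report.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique, present) counts, ignoring missing values.

    Assumption: "distinct" is the number of different values, and "unique"
    is the number of values that occur exactly once.
    """
    present = [v for v in values if v is not None]
    freq = Counter(present)
    distinct = len(freq)
    unique = sum(1 for n in freq.values() if n == 1)
    return distinct, unique, len(present)


# Hypothetical column: 40 one-off names, one label repeated twice, 10 missing.
values = [f"station_{i}" for i in range(40)] + ["측정소", "측정소"] + [None] * 10
distinct, unique, n_present = distinct_and_unique(values)

print(distinct, unique, n_present)
print(f"{100 * distinct / n_present:.1f}% distinct")
print(f"{100 * unique / n_present:.1f}% unique")
```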

Sample

1st row: 측정소
2nd row: 종로구
3rd row: 중구
4th row: 용산구
5th row: 성동구
Value | Count | Frequency (%)
측정소 | 2 | 4.2%
(value not preserved) | 2 | 4.2%
서울역 | 1 | 2.1%
관악구 | 1 | 2.1%
서초구 | 1 | 2.1%
강남구 | 1 | 2.1%
송파구 | 1 | 2.1%
(value not preserved) | 1 | 2.1%
청계4가 | 1 | 2.1%
영등포 | 1 | 2.1%
Other values (36) | 36 | 75.0%

Most occurring characters

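Value | Count | Frequency (%)
(not preserved) | 26 | 19.5%
(not preserved) | 7 | 5.3%
(not preserved) | 7 | 5.3%
(space) | 6 | 4.5%
(not preserved) | 5 | 3.8%
(not preserved) | 4 | 3.0%
(not preserved) | 4 | 3.0%
(not preserved) | 3 | 2.3%
(not preserved) | 3 | 2.3%
(not preserved) | 3 | 2.3%
Other values (52) | 65 | 48.9%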

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 126 | 94.7%
Space Separator | 6 | 4.5%
Decimal Number | 1 | 0.8%

Most frequent character per category

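Other Letter
Value | Count | Frequency (%)
(not preserved) | 26 | 20.6%
(not preserved) | 7 | 5.6%
(not preserved) | 7 | 5.6%
(not preserved) | 5 | 4.0%
(not preserved) | 4 | 3.2%
(not preserved) | 4 | 3.2%
(not preserved) | 3 | 2.4%
(not preserved) | 3 | 2.4%
(not preserved) | 3 | 2.4%
(not preserved) | 2 | 1.6%
Other values (50) | 62 | 49.2%
Space Separator
Value | Count | Frequency (%)
(space) | 6 | 100.0%
Decimal Number
Value | Count | Frequency (%)
4 | 1 | 100.0%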

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 126 | 94.7%
Common | 7 | 5.3%

Most frequent character per script

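Hangul
Value | Count | Frequency (%)
(not preserved) | 26 | 20.6%
(not preserved) | 7 | 5.6%
(not preserved) | 7 | 5.6%
(not preserved) | 5 | 4.0%
(not preserved) | 4 | 3.2%
(not preserved) | 4 | 3.2%
(not preserved) | 3 | 2.4%
(not preserved) | 3 | 2.4%
(not preserved) | 3 | 2.4%
(not preserved) | 2 | 1.6%
Other values (50) | 62 | 49.2%
Common
Value | Count | Frequency (%)
(space) | 6 | 85.7%
4 | 1 | 14.3%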

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 126 | 94.7%
ASCII | 7 | 5.3%

Most frequent character per block

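Hangul
Value | Count | Frequency (%)
(not preserved) | 26 | 20.6%
(not preserved) | 7 | 5.6%
(not preserved) | 7 | 5.6%
(not preserved) | 5 | 4.0%
(not preserved) | 4 | 3.2%
(not preserved) | 4 | 3.2%
(not preserved) | 3 | 2.4%
(not preserved) | 3 | 2.4%
(not preserved) | 3 | 2.4%
(not preserved) | 2 | 1.6%
Other values (50) | 62 | 49.2%
ASCII
Value | Count | Frequency (%)
(space) | 6 | 85.7%
4 | 1 | 14.3%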

도시대기측정소
Text

MISSING 

Distinct: 42
Distinct (%): 97.7%
Missing: 9
Missing (%): 17.3%
Memory size: 548.0 B

Length

Max length: 35
Median length: 28
Mean length: 19.976744
Min length: 3

Characters and Unicode

Total characters: 859
Distinct characters: 137
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 41
Unique (%): 95.3%

Sample

1st row: 신번지
2nd row: 종로구 삼봉로 43 (종로5․6가주민센터)
3rd row: 중구 덕수궁길 15 (시청서소문별관 3층)
4th row: 용산구 한남대로 136 (한남직업전문학교 본관)
5th row: 성동구 서울숲7길 (서울숲 방문자센터 옆)
Value | Count | Frequency (%)
주민센터 | 9 | 5.4%
중구 | 3 | 1.8%
신번지 | 2 | 1.2%
노원구 | 2 | 1.2%
강남구 | 2 | 1.2%
성동구 | 2 | 1.2%
영등포구 | 2 | 1.2%
용산구 | 2 | 1.2%
43 | 2 | 1.2%
동작구 | 2 | 1.2%
Other values (133) | 140 | 83.3%

Most occurring characters

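Value | Count | Frequency (%)
(space) | 127 | 14.8%
(not preserved) | 47 | 5.5%
(not preserved) | 37 | 4.3%
1 | 34 | 4.0%
(not preserved) | 31 | 3.6%
) | 26 | 3.0%
( | 26 | 3.0%
2 | 24 | 2.8%
4 | 21 | 2.4%
(not preserved) | 20 | 2.3%
Other values (127) | 466 | 54.2%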

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 508 | 59.1%
Decimal Number | 158 | 18.4%
Space Separator | 127 | 14.8%
Close Punctuation | 26 | 3.0%
Open Punctuation | 26 | 3.0%
Dash Punctuation | 11 | 1.3%
Other Punctuation | 3 | 0.3%

Most frequent character per category

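Other Letter
Value | Count | Frequency (%)
(not preserved) | 47 | 9.3%
(not preserved) | 37 | 7.3%
(not preserved) | 31 | 6.1%
(not preserved) | 20 | 3.9%
(not preserved) | 15 | 3.0%
(not preserved) | 15 | 3.0%
(not preserved) | 12 | 2.4%
(not preserved) | 12 | 2.4%
(not preserved) | 11 | 2.2%
(not preserved) | 11 | 2.2%
Other values (111) | 297 | 58.5%
Decimal Number
Value | Count | Frequency (%)
1 | 34 | 21.5%
2 | 24 | 15.2%
4 | 21 | 13.3%
3 | 18 | 11.4%
6 | 16 | 10.1%
5 | 14 | 8.9%
7 | 12 | 7.6%
0 | 7 | 4.4%
8 | 6 | 3.8%
9 | 6 | 3.8%
Other Punctuation
Value | Count | Frequency (%)
, | 2 | 66.7%
(not preserved) | 1 | 33.3%
Space Separator
Value | Count | Frequency (%)
(space) | 127 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 26 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 26 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 11 | 100.0%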

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 508 | 59.1%
Common | 351 | 40.9%

Most frequent character per script

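Hangul
Value | Count | Frequency (%)
(not preserved) | 47 | 9.3%
(not preserved) | 37 | 7.3%
(not preserved) | 31 | 6.1%
(not preserved) | 20 | 3.9%
(not preserved) | 15 | 3.0%
(not preserved) | 15 | 3.0%
(not preserved) | 12 | 2.4%
(not preserved) | 12 | 2.4%
(not preserved) | 11 | 2.2%
(not preserved) | 11 | 2.2%
Other values (111) | 297 | 58.5%
Common
Value | Count | Frequency (%)
(space) | 127 | 36.2%
1 | 34 | 9.7%
) | 26 | 7.4%
( | 26 | 7.4%
2 | 24 | 6.8%
4 | 21 | 6.0%
3 | 18 | 5.1%
6 | 16 | 4.6%
5 | 14 | 4.0%
7 | 12 | 3.4%
Other values (6) | 33 | 9.4%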

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 508 | 59.1%
ASCII | 350 | 40.7%
Punctuation | 1 | 0.1%

Most frequent character per block

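ASCII
Value | Count | Frequency (%)
(space) | 127 | 36.3%
1 | 34 | 9.7%
) | 26 | 7.4%
( | 26 | 7.4%
2 | 24 | 6.9%
4 | 21 | 6.0%
3 | 18 | 5.1%
6 | 16 | 4.6%
5 | 14 | 4.0%
7 | 12 | 3.4%
Other values (5) | 32 | 9.1%
Hangul
Value | Count | Frequency (%)
(not preserved) | 47 | 9.3%
(not preserved) | 37 | 7.3%
(not preserved) | 31 | 6.1%
(not preserved) | 20 | 3.9%
(not preserved) | 15 | 3.0%
(not preserved) | 15 | 3.0%
(not preserved) | 12 | 2.4%
(not preserved) | 12 | 2.4%
(not preserved) | 11 | 2.2%
(not preserved) | 11 | 2.2%
Other values (111) | 297 | 58.5%
Punctuation
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%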

Unnamed: 3
Text

MISSING 

Distinct: 42
Distinct (%): 97.7%
Missing: 9
Missing (%): 17.3%
Memory size: 548.0 B

Length

Max length: 34
Median length: 26
Mean length: 21.790698
Min length: 3

Characters and Unicode

Total characters: 937
Distinct characters: 151
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 41
Unique (%): 95.3%

Sample

1st row: 2015.03.25일 현재
2nd row: 구번지
3rd row: 종로구 효제동 173-2 (종로5~6가주민센터)
4th row: 중구 서소문동 37번지 (시청별관 3동)
5th row: 용산구 한남2동 726-366 (한남직업전문학교)
Value | Count | Frequency (%)
중구 | 3 | 1.8%
구번지 | 2 | 1.2%
강동구 | 2 | 1.2%
노원구 | 2 | 1.2%
용산구 | 2 | 1.2%
영등포구 | 2 | 1.2%
동작구 | 2 | 1.2%
동대문구 | 2 | 1.2%
마포구 | 2 | 1.2%
종로구 | 2 | 1.2%
Other values (138) | 145 | 87.3%

Most occurring characters

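Value | Count | Frequency (%)
(space) | 124 | 13.2%
(not preserved) | 48 | 5.1%
(not preserved) | 47 | 5.0%
( | 38 | 4.1%
) | 38 | 4.1%
1 | 31 | 3.3%
2 | 22 | 2.3%
(not preserved) | 22 | 2.3%
3 | 21 | 2.2%
- | 20 | 2.1%
Other values (141) | 526 | 56.1%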

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 542 | 57.8%
Decimal Number | 172 | 18.4%
Space Separator | 124 | 13.2%
Open Punctuation | 38 | 4.1%
Close Punctuation | 38 | 4.1%
Dash Punctuation | 20 | 2.1%
Other Punctuation | 2 | 0.2%
Math Symbol | 1 | 0.1%

Most frequent character per category

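Other Letter
Value | Count | Frequency (%)
(not preserved) | 48 | 8.9%
(not preserved) | 47 | 8.7%
(not preserved) | 22 | 4.1%
(not preserved) | 15 | 2.8%
(not preserved) | 13 | 2.4%
(not preserved) | 12 | 2.2%
(not preserved) | 12 | 2.2%
(not preserved) | 12 | 2.2%
(not preserved) | 11 | 2.0%
(not preserved) | 11 | 2.0%
Other values (125) | 339 | 62.5%
Decimal Number
Value | Count | Frequency (%)
1 | 31 | 18.0%
2 | 22 | 12.8%
3 | 21 | 12.2%
4 | 19 | 11.0%
5 | 19 | 11.0%
0 | 15 | 8.7%
6 | 13 | 7.6%
8 | 12 | 7.0%
9 | 10 | 5.8%
7 | 10 | 5.8%
Space Separator
Value | Count | Frequency (%)
(space) | 124 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 38 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 38 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 20 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
. | 2 | 100.0%
Math Symbol
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%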

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 542 | 57.8%
Common | 395 | 42.2%

Most frequent character per script

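Hangul
Value | Count | Frequency (%)
(not preserved) | 48 | 8.9%
(not preserved) | 47 | 8.7%
(not preserved) | 22 | 4.1%
(not preserved) | 15 | 2.8%
(not preserved) | 13 | 2.4%
(not preserved) | 12 | 2.2%
(not preserved) | 12 | 2.2%
(not preserved) | 12 | 2.2%
(not preserved) | 11 | 2.0%
(not preserved) | 11 | 2.0%
Other values (125) | 339 | 62.5%
Common
Value | Count | Frequency (%)
(space) | 124 | 31.4%
( | 38 | 9.6%
) | 38 | 9.6%
1 | 31 | 7.8%
2 | 22 | 5.6%
3 | 21 | 5.3%
- | 20 | 5.1%
4 | 19 | 4.8%
5 | 19 | 4.8%
0 | 15 | 3.8%
Other values (6) | 48 | 12.2%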

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 542 | 57.8%
ASCII | 394 | 42.0%
None | 1 | 0.1%

Most frequent character per block

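ASCII
Value | Count | Frequency (%)
(space) | 124 | 31.5%
( | 38 | 9.6%
) | 38 | 9.6%
1 | 31 | 7.9%
2 | 22 | 5.6%
3 | 21 | 5.3%
- | 20 | 5.1%
4 | 19 | 4.8%
5 | 19 | 4.8%
0 | 15 | 3.8%
Other values (5) | 47 | 11.9%
Hangul
Value | Count | Frequency (%)
(not preserved) | 48 | 8.9%
(not preserved) | 47 | 8.7%
(not preserved) | 22 | 4.1%
(not preserved) | 15 | 2.8%
(not preserved) | 13 | 2.4%
(not preserved) | 12 | 2.2%
(not preserved) | 12 | 2.2%
(not preserved) | 12 | 2.2%
(not preserved) | 11 | 2.0%
(not preserved) | 11 | 2.0%
Other values (125) | 339 | 62.5%
None
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%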

Unnamed: 4
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 6
Missing (%): 11.5%
Memory size: 548.0 B

Unnamed: 5
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 6
Missing (%): 11.5%
Memory size: 548.0 B

Unnamed: 6
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 6
Missing (%): 11.5%
Memory size: 548.0 B

Unnamed: 7
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 6
Missing (%): 11.5%
Memory size: 548.0 B

Correlations

(variable) | Unnamed: 1 | 도시대기측정소 | Unnamed: 3
Unnamed: 1 | 1.000 | 1.000 | 1.000
도시대기측정소 | 1.000 | 1.000 | 1.000
Unnamed: 3 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
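
A minimal sketch of what a nullity correlation measures, assuming it is the Pearson correlation between the "is present" indicators of two columns. The function name and the toy columns are hypothetical and are not taken from this dataset or from the profiler's implementation.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    # 1 where a value is present, 0 where it is missing (None).
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is always present (or always missing) has no variance.
        return float("nan")
    return cov / sqrt(var_a * var_b)


# Hypothetical toy columns.
left = ["a", None, "c", None]
same = ["x", None, "z", None]
opposite = [None, "y", None, "w"]

together = nullity_correlation(left, same)
apart = nullity_correlation(left, opposite)
print(together, apart)
```

A value near 1 means two columns tend to be missing in the same rows, and a value near -1 means one tends to be present exactly where the other is missing.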

Sample

(index) | Unnamed: 0 | Unnamed: 1 | 도시대기측정소 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7
0 | NaN | <NA> | <NA> | 2015.03.25일 현재 | NaN | NaN | NaN | NaN
1 | 번호 | 측정소 | 신번지 | 구번지 | 표고점\n(m) | 측정소\n높이(m) | 채취구\n높이(m) | 해발채취구 높이(m)
2 | 1 | 종로구 | 종로구 삼봉로 43 (종로5․6가주민센터) | 종로구 효제동 173-2 (종로5~6가주민센터) | 21.4 | 14.5 | 4.8 | 40.7
3 | 2 | 중구 | 중구 덕수궁길 15 (시청서소문별관 3층) | 중구 서소문동 37번지 (시청별관 3동) | 32.3 | 14 | 4.8 | 51.1
4 | 3 | 용산구 | 용산구 한남대로 136 (한남직업전문학교 본관) | 용산구 한남2동 726-366 (한남직업전문학교) | 38.4 | 10.5 | 4.4 | 53.3
5 | 4 | 성동구 | 성동구 서울숲7길 (서울숲 방문자센터 옆) | 성동구 성수동 성수1가 685-20 (서울숲) | 13.5 | 0 | 5 | 18.5
6 | 5 | 광진구 | 광진구 광나루로 570 (구의아리수정수센터내) | 광진구 광장동 520-9 (구의아리수정수센터) | 38.4 | 5.5 | 4.8 | 48.7
7 | 6 | 동대문구 | 동대문구 천호대로13길 43 (용두초등학교) | 동대문구 용두2동 237-1 (용두초등학교) | 16.5 | 10.5 | 5.5 | 32.5
8 | 7 | 중랑구 | 중랑구 용마산로 369 (건강가정지원센터) | 중랑구 면목8동 62-2(건강가정지원센터) | 38.2 | 10 | 4.8 | 53
9 | 8 | 성북구 | 성북구 삼양로2길 70 (길음2동 주민센터) | 성북구 길음3동 1064-1 (길음3동주민센터) | 35.9 | 10.6 | 5.4 | 51.9
(index) | Unnamed: 0 | Unnamed: 1 | 도시대기측정소 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7
42 | 8 | 종 로 | 종로구 종로4가 32-8도 | 종로구 종로 169(종묘공원앞) | 29 | 0.5 | 1.5 | 31
43 | 9 | 길 동 | 강동구 길동 426-1도 | 강동구 천호대로 1151(길동사거리) | 28 | 0.5 | 1.5 | 30
44 | 10 | 태 능 | 노원구 공능동 678-1도 | 노원구 화랑로 429(태능입구역 8번출구) | 19.7 | 0.5 | 2.5 | 22.7
45 | 11 | 공항로 | 강서구 가양동 803도 | 강서구 공황대로 271(마곡역 앞) | 11 | 0.5 | 2.5 | 14
46 | 12 | 강변북로 | 성동구 성수동 642-25 | 성동구 강변북로 257 (한강사업본부내) | 19.6 | 0.5 | 2.5 | 22.6
47 | 13 | 내부순환로 | 성북구 정릉3동 998도 | 성북구 정릉로 49 (내부순환로 내선 정릉램프 시점) | 76 | 15 | 2.7 | 93.7
48 | 14 | 양 재 | 서초구 양재1동 19-14도 | 서초구 강남대로 (서초구민회관 앞 버스중앙차로승강장) | 27.2 | 0.5 | 2.5 | 30.2
49 | 15 | 동작대로 | 동작구 사당2동 739도 | 동작구 동작대로 145 (4호선이수역 북단 버스중앙차로승강장) | 11 | 0.3 | 2.7 | 14
50 | NaN | <NA> | <NA> | <NA> | 지면 | 측정소\n건물바닥 | 채취구 | 채취구
51 | NaN | <NA> | <NA> | <NA> | 해면 | 지면 | 측정소\n건물바닥 | 해면

Duplicate rows

Most frequently occurring

(index) | Unnamed: 1 | 도시대기측정소 | Unnamed: 3 | # duplicates
1 | <NA> | <NA> | <NA> | 8
0 | 측정소 | 신번지 | 구번지 | 2
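
The table above groups identical rows over the three text columns and reports how often each group occurs. A minimal sketch of that kind of grouping, using only the standard library. The helper name and the sample rows are hypothetical, and `None` stands in for a missing cell.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return {row: count} for every row that appears more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical rows restricted to three text columns.
rows = [
    ("측정소", "신번지", "구번지"),
    ("종로구", "종로구 삼봉로 43", "종로구 효제동 173-2"),
    ("측정소", "신번지", "구번지"),
    (None, None, None),
    (None, None, None),
]

groups = duplicate_groups(rows)
for row, n in groups.items():
    print(row, n)
```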