Overview

Dataset statistics

Number of variables: 8
Number of observations: 280
Missing cells: 281
Missing cells (%): 12.5%
Duplicate rows: 22
Duplicate rows (%): 7.9%
Total size in memory: 17.9 KiB
Average record size in memory: 65.5 B
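
The percentages and the average record size above can be re-derived from the raw counts. The following is a minimal sketch, assuming the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative only.

```python
# Raw counts as reported in the dataset statistics.
n_vars, n_rows = 8, 280
missing_cells, duplicate_rows = 281, 22
total_kib = 17.9

# Assumption: missing % is computed over every cell in the table.
missing_pct = 100 * missing_cells / (n_vars * n_rows)
# Duplicate % is computed over rows.
duplicate_pct = 100 * duplicate_rows / n_rows
# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_kib * 1024 / n_rows

print(f"{missing_pct:.1f}% {duplicate_pct:.1f}% {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the three values agree with the reported 12.5%, 7.9% and 65.5 B.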

Variable types

Unsupported: 7
Text: 1

Dataset

Description: File download
Author: Seoul Metro (서울교통공사)
URL: https://data.seoul.go.kr/dataList/OA-11573/S/1/datasetView.do

Alerts

[Duplicates] Dataset has 22 (7.9%) duplicate rows
[Missing] Unnamed: 0 has 280 (100.0%) missing values
[Unsupported] Unnamed: 0 is an unsupported type, check if it needs cleaning or further analysis
[Unsupported] 승강기 총괄대수(건물측 포함) is an unsupported type, check if it needs cleaning or further analysis
[Unsupported] Unnamed: 3 is an unsupported type, check if it needs cleaning or further analysis
[Unsupported] Unnamed: 4 is an unsupported type, check if it needs cleaning or further analysis
[Unsupported] Unnamed: 5 is an unsupported type, check if it needs cleaning or further analysis
[Unsupported] Unnamed: 6 is an unsupported type, check if it needs cleaning or further analysis
[Unsupported] Unnamed: 7 is an unsupported type, check if it needs cleaning or further analysis

Reproduction

Analysis started: 2023-12-11 05:46:50.637834
Analysis finished: 2023-12-11 05:46:50.981526
Duration: 0.34 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

Unnamed: 0
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 280
Missing (%): 100.0%
Memory size: 2.6 KiB

승강기 총괄대수(건물측 포함)
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Unnamed: 2
Text

Distinct: 257
Distinct (%): 92.1%
Missing: 1
Missing (%): 0.4%
Memory size: 2.3 KiB

Length

Max length: 12
Median length: 9
Mean length: 3.1756272
Min length: 2

Characters and Unicode

Total characters: 886
Distinct characters: 216
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
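
As an illustration of how such per-category counts can be obtained, here is a minimal sketch using Python's standard `unicodedata` module. It is not the profiler's own implementation. The helper name `category_counts` is hypothetical, and the input is just the five sample values listed under "Sample" for this variable. The two-letter codes map to the report's labels as follows: `Lo` is Other Letter, `Nd` is Decimal Number, `Ps` is Open Punctuation and `Pe` is Close Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample values shown for this variable.
sample = ["역명", "서울역(1)", "시청(1)", "종각", "종로3가(1)"]
counts = category_counts(sample)
print(counts)
```

Script and block membership are not exposed by `unicodedata`, so reproducing those tables would need a separate lookup.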

Unique

Unique: 235
Unique (%): 84.2%

Sample

1st row: 역명 ("station name")
2nd row: 서울역(1)
3rd row: 시청(1)
4th row: 종각
5th row: 종로3가(1)

Value | Count | Frequency (%)
공덕 | 2 | 0.7%
을지로4가 | 2 | 0.7%
군자 | 2 | 0.7%
합정 | 2 | 0.7%
오금 | 2 | 0.7%
삼각지 | 2 | 0.7%
가락시장 | 2 | 0.7%
충정로 | 2 | 0.7%
왕십리 | 2 | 0.7%
천호 | 2 | 0.7%
Other values (247) | 259 | 92.8%

Most occurring characters

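Value | Count | Frequency (%)
(not shown) | 33 | 3.7%
(not shown) | 28 | 3.2%
(not shown) | 25 | 2.8%
(not shown) | 24 | 2.7%
) | 20 | 2.3%
( | 20 | 2.3%
(not shown) | 19 | 2.1%
(not shown) | 16 | 1.8%
(not shown) | 15 | 1.7%
(not shown) | 15 | 1.7%
Other values (206) | 671 | 75.7%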

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 818 | 92.3%
Decimal Number | 28 | 3.2%
Close Punctuation | 20 | 2.3%
Open Punctuation | 20 | 2.3%

Most frequent character per category

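Other Letter
Value | Count | Frequency (%)
(not shown) | 33 | 4.0%
(not shown) | 28 | 3.4%
(not shown) | 25 | 3.1%
(not shown) | 24 | 2.9%
(not shown) | 19 | 2.3%
(not shown) | 16 | 2.0%
(not shown) | 15 | 1.8%
(not shown) | 15 | 1.8%
(not shown) | 14 | 1.7%
(not shown) | 14 | 1.7%
Other values (199) | 615 | 75.2%

Decimal Number
Value | Count | Frequency (%)
3 | 8 | 28.6%
4 | 7 | 25.0%
1 | 6 | 21.4%
2 | 6 | 21.4%
5 | 1 | 3.6%

Close Punctuation
Value | Count | Frequency (%)
) | 20 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 20 | 100.0%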

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 818 | 92.3%
Common | 68 | 7.7%

Most frequent character per script

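Hangul
Value | Count | Frequency (%)
(not shown) | 33 | 4.0%
(not shown) | 28 | 3.4%
(not shown) | 25 | 3.1%
(not shown) | 24 | 2.9%
(not shown) | 19 | 2.3%
(not shown) | 16 | 2.0%
(not shown) | 15 | 1.8%
(not shown) | 15 | 1.8%
(not shown) | 14 | 1.7%
(not shown) | 14 | 1.7%
Other values (199) | 615 | 75.2%

Common
Value | Count | Frequency (%)
) | 20 | 29.4%
( | 20 | 29.4%
3 | 8 | 11.8%
4 | 7 | 10.3%
1 | 6 | 8.8%
2 | 6 | 8.8%
5 | 1 | 1.5%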

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 818 | 92.3%
ASCII | 68 | 7.7%

Most frequent character per block

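Hangul
Value | Count | Frequency (%)
(not shown) | 33 | 4.0%
(not shown) | 28 | 3.4%
(not shown) | 25 | 3.1%
(not shown) | 24 | 2.9%
(not shown) | 19 | 2.3%
(not shown) | 16 | 2.0%
(not shown) | 15 | 1.8%
(not shown) | 15 | 1.8%
(not shown) | 14 | 1.7%
(not shown) | 14 | 1.7%
Other values (199) | 615 | 75.2%

ASCII
Value | Count | Frequency (%)
) | 20 | 29.4%
( | 20 | 29.4%
3 | 8 | 11.8%
4 | 7 | 10.3%
1 | 6 | 8.8%
2 | 6 | 8.8%
5 | 1 | 1.5%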

Unnamed: 3
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Unnamed: 4
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Unnamed: 5
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Unnamed: 6
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Unnamed: 7
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

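First rows

Index | Unnamed: 0 | 승강기 총괄대수(건물측 포함) | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7
0 | <NA> | 호선 (line) | 역명 (station name) | 엘리베이터 (elevator) | 에스컬레이터 (escalator) | 무빙워크 (moving walkway) | 휠체어리프트 (wheelchair lift) | 합계 (total)
1 | <NA> | 총 합계 (grand total) | <NA> | 807 | 1733 | 20 | 161 | 2721
2 | <NA> | 1 | 서울역(1) | 4 | 3 | 0 | 1 | 8
3 | <NA> | 1 | 시청(1) | 3 | 3 | 0 | 0 | 6
4 | <NA> | 1 | 종각 | 4 | 2 | 0 | 0 | 6
5 | <NA> | 1 | 종로3가(1) | 3 | 4 | 0 | 0 | 7
6 | <NA> | 1 | 종로5가 | 3 | 0 | 0 | 0 | 3
7 | <NA> | 1 | 동대문(1) | 3 | 1 | 0 | 0 | 4
8 | <NA> | 1 | 동묘앞 | 7 | 12 | 0 | 0 | 19
9 | <NA> | 1 | 신설동(1) | 5 | 0 | 0 | 6 | 11

Last rows

Index | Unnamed: 0 | 승강기 총괄대수(건물측 포함) | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7
270 | <NA> | 8 | 복정 | 1 | 3 | 0 | 1 | 5
271 | <NA> | 8 | 산성 | 4 | 13 | 0 | 0 | 17
272 | <NA> | 8 | 남한산성입구 | 2 | 4 | 0 | 2 | 8
273 | <NA> | 8 | 단대오거리 | 3 | 4 | 0 | 0 | 7
274 | <NA> | 8 | 신흥 | 2 | 0 | 0 | 0 | 2
275 | <NA> | 8 | 수진 | 2 | 0 | 0 | 1 | 3
276 | <NA> | 8 | 모란 | 3 | 2 | 0 | 2 | 7
277 | <NA> | 6(기타) | 성산별관 | 1 | 0 | 0 | 0 | 1
278 | <NA> | 7(기타) | 천왕기지 | 1 | 0 | 0 | 0 | 1
279 | <NA> | 기타 (other) | 대공원어린이집 | 0 | 0 | 0 | 1 | 1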
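
The sample suggests that the real column headers sit in the first data row, which would explain why most columns mix text and numbers and are reported as unsupported. The following is a minimal sketch of one way to clean this up before profiling, assuming pandas is available. The frame `raw` is a hand-built miniature of the sample rows, not the actual file, and `promote_header` is a hypothetical helper.

```python
import pandas as pd

# Hypothetical miniature of the raw table: row 0 holds the real headers.
raw = pd.DataFrame(
    [
        [None, "호선", "역명", "엘리베이터", "에스컬레이터", "무빙워크", "휠체어리프트", "합계"],
        [None, "총 합계", None, 807, 1733, 20, 161, 2721],
        [None, 1, "서울역(1)", 4, 3, 0, 1, 8],
        [None, 1, "시청(1)", 3, 3, 0, 0, 6],
    ],
    columns=[
        "Unnamed: 0", "승강기 총괄대수(건물측 포함)", "Unnamed: 2", "Unnamed: 3",
        "Unnamed: 4", "Unnamed: 5", "Unnamed: 6", "Unnamed: 7",
    ],
)


def promote_header(df):
    """Drop empty columns, use row 0 as the header, keep only station rows."""
    df = df.dropna(axis="columns", how="all")      # removes the all-missing column
    header = df.iloc[0].tolist()                   # real column names
    body = df.iloc[1:].copy()
    body.columns = header
    body = body.loc[body["역명"].notna()].copy()    # drops the grand-total row
    count_cols = header[2:]                        # the five count columns
    body = body.astype({col: "int64" for col in count_cols})
    return body.reset_index(drop=True)


clean = promote_header(raw)
print(clean.dtypes)
```

With integer count columns, a profiler should be able to treat them as numeric rather than unsupported. Whether the grand-total row should be dropped or kept depends on the analysis.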

Duplicate rows

Most frequently occurring

Index | Unnamed: 2 | # duplicates
0 | 가락시장 | 2
1 | 건대입구 | 2
2 | 고속터미널 | 2
3 | 공덕 | 2
4 | 군자 | 2
5 | 노원 | 2
6 | 대림 | 2
7 | 동묘앞 | 2
8 | 불광 | 2
9 | 삼각지 | 2
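
The table above is keyed on Unnamed: 2 alone, so the duplicates appear to be repeated station names rather than fully identical records. As a minimal sketch of the same count using only the standard library, the hypothetical helper `duplicated_values` below tallies values that occur more than once. The input list is illustrative and is not the full column.

```python
from collections import Counter


def duplicated_values(values):
    """Return {value: count} for every non-missing value seen more than once."""
    counts = Counter(v for v in values if v is not None)
    return {value: n for value, n in counts.items() if n > 1}


# Illustrative input: two names repeated, one unique, one missing.
names = ["공덕", "군자", "공덕", "종각", "군자", None]
dupes = duplicated_values(names)
print(dupes)
```

A station name that recurs on a different line is not necessarily an error, so such rows may need to be compared together with the line column before being dropped.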