Overview

Dataset statistics

Number of variables: 6
Number of observations: 250
Missing cells: 733
Missing cells (%): 48.9%
Duplicate rows: 21
Duplicate rows (%): 8.4%
Total size in memory: 12.1 KiB
Average record size in memory: 49.5 B
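
The percentages above follow from the raw counts, and the per-column missing counts reported under Variables add up to the dataset total. A minimal sketch of that arithmetic, assuming the single Text variable is `Unnamed: 1` (the report body does not label it explicitly):

```python
# Figures copied from the Dataset statistics block.
n_variables = 6
n_observations = 250
missing_cells = 733
duplicate_rows = 21

total_cells = n_variables * n_observations            # 6 * 250 = 1500 cells
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Per-column missing counts as listed under Variables.
# Assumption: the Text variable with 1 missing value is "Unnamed: 1".
missing_by_column = {
    "◎ 에스컬레이터 역별 대수": 240,
    "Unnamed: 1": 1,
    "Unnamed: 2": 0,
    "Unnamed: 3": 1,
    "Unnamed: 4": 241,
    "2019.12.01.기준": 250,
}

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print("Sum of per-column missing:", sum(missing_by_column.values()))
```

The rounded results agree with the reported 48.9% and 8.4%, and the column counts sum to 733.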

Variable types

Unsupported: 5
Text: 1

Dataset

Description: File download
Author: Seoul Metro (서울 교통공사)
URL: https://data.seoul.go.kr/dataList/OA-13242/F/1/datasetView.do

Alerts

Dataset has 21 (8.4%) duplicate rows. [Duplicates]
◎ 에스컬레이터 역별 대수 has 240 (96.0%) missing values. [Missing]
Unnamed: 4 has 241 (96.4%) missing values. [Missing]
2019.12.01.기준 has 250 (100.0%) missing values. [Missing]
◎ 에스컬레이터 역별 대수 is an unsupported type; check whether it needs cleaning or further analysis. [Unsupported]
Unnamed: 2 is an unsupported type; check whether it needs cleaning or further analysis. [Unsupported]
Unnamed: 3 is an unsupported type; check whether it needs cleaning or further analysis. [Unsupported]
Unnamed: 4 is an unsupported type; check whether it needs cleaning or further analysis. [Unsupported]
2019.12.01.기준 is an unsupported type; check whether it needs cleaning or further analysis. [Unsupported]

Reproduction

Analysis started: 2023-12-11 04:43:16.251889
Analysis finished: 2023-12-11 04:43:16.749851
Duration: 0.5 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

◎ 에스컬레이터 역별 대수
Unsupported

MISSING  REJECTED  UNSUPPORTED

Missing: 240
Missing (%): 96.0%
Memory size: 2.1 KiB

Unnamed: 1
Text

Distinct: 228
Distinct (%): 91.6%
Missing: 1
Missing (%): 0.4%
Memory size: 2.1 KiB

Length

Max length: 12
Median length: 10
Mean length: 3.2771084
Min length: 2

Characters and Unicode

Total characters: 816
Distinct characters: 207
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
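
As an illustration of the character properties mentioned above, Python's standard `unicodedata` module exposes the general category of each code point. This is a minimal sketch, not the profiler's own implementation; the helper name `category_counts` is made up for the example.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# One of the station names shown in this report's sample values.
sample = "동대문(1)"
counts = category_counts(sample)

# Lo = Other Letter, Ps = Open Punctuation,
# Pe = Close Punctuation, Nd = Decimal Number.
print(counts)
```

For this value the three Hangul syllables fall under `Lo`, the parentheses under `Ps` and `Pe`, and the digit under `Nd`, which are the same four categories the report lists.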

Unique

Unique: 207
Unique (%): 83.1%
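
The ratios reported for this text variable appear to be computed over non-missing values only. A short check of that reading, using only figures from the report (the variable names are illustrative):

```python
n_observations = 250
n_missing = 1
n_present = n_observations - n_missing      # 249 non-missing values

total_characters = 816
distinct_values = 228
unique_values = 207

# Each ratio divides by the non-missing count, not by all 250 rows.
mean_length = total_characters / n_present
distinct_pct = 100 * distinct_values / n_present
unique_pct = 100 * unique_values / n_present

print(f"Mean length: {mean_length:.7f}")
print(f"Distinct: {distinct_pct:.1f}%")
print(f"Unique: {unique_pct:.1f}%")
```

Dividing by 249 reproduces the reported mean length of 3.2771084 and the 91.6% and 83.1% figures; dividing by 250 would not.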

Sample

1st row: 역명
2nd row: 동대문(1)
3rd row: 동묘앞
4th row: 서울(1)
5th row: 시청(1)

| Value | Count | Frequency (%) |
| 불광 | 2 | 0.8% |
| 태릉입구 | 2 | 0.8% |
| 공덕 | 2 | 0.8% |
| 노원 | 2 | 0.8% |
| 왕십리 | 2 | 0.8% |
| 삼각지 | 2 | 0.8% |
| 을지로4가 | 2 | 0.8% |
| 신당 | 2 | 0.8% |
| 충정로 | 2 | 0.8% |
| 합정 | 2 | 0.8% |
| Other values (218) | 229 | 92.0% |

Most occurring characters

| Value | Count | Frequency (%) |
|  | 31 | 3.8% |
|  | 27 | 3.3% |
| ( | 26 | 3.2% |
| ) | 26 | 3.2% |
|  | 22 | 2.7% |
|  | 21 | 2.6% |
|  | 15 | 1.8% |
|  | 15 | 1.8% |
|  | 14 | 1.7% |
|  | 13 | 1.6% |
| Other values (197) | 606 | 74.3% |

Most occurring categories

| Value | Count | Frequency (%) |
| Other Letter | 741 | 90.8% |
| Open Punctuation | 26 | 3.2% |
| Close Punctuation | 26 | 3.2% |
| Decimal Number | 23 | 2.8% |

Most frequent character per category

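Other Letter
| Value | Count | Frequency (%) |
|  | 31 | 4.2% |
|  | 27 | 3.6% |
|  | 22 | 3.0% |
|  | 21 | 2.8% |
|  | 15 | 2.0% |
|  | 15 | 2.0% |
|  | 14 | 1.9% |
|  | 13 | 1.8% |
|  | 13 | 1.8% |
|  | 13 | 1.8% |
| Other values (191) | 557 | 75.2% |

Decimal Number
| Value | Count | Frequency (%) |
| 3 | 7 | 30.4% |
| 4 | 7 | 30.4% |
| 1 | 5 | 21.7% |
| 2 | 4 | 17.4% |

Open Punctuation
| Value | Count | Frequency (%) |
| ( | 26 | 100.0% |

Close Punctuation
| Value | Count | Frequency (%) |
| ) | 26 | 100.0% |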

Most occurring scripts

| Value | Count | Frequency (%) |
| Hangul | 741 | 90.8% |
| Common | 75 | 9.2% |

Most frequent character per script

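Hangul
| Value | Count | Frequency (%) |
|  | 31 | 4.2% |
|  | 27 | 3.6% |
|  | 22 | 3.0% |
|  | 21 | 2.8% |
|  | 15 | 2.0% |
|  | 15 | 2.0% |
|  | 14 | 1.9% |
|  | 13 | 1.8% |
|  | 13 | 1.8% |
|  | 13 | 1.8% |
| Other values (191) | 557 | 75.2% |

Common
| Value | Count | Frequency (%) |
| ( | 26 | 34.7% |
| ) | 26 | 34.7% |
| 3 | 7 | 9.3% |
| 4 | 7 | 9.3% |
| 1 | 5 | 6.7% |
| 2 | 4 | 5.3% |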

Most occurring blocks

| Value | Count | Frequency (%) |
| Hangul | 741 | 90.8% |
| ASCII | 75 | 9.2% |

Most frequent character per block

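Hangul
| Value | Count | Frequency (%) |
|  | 31 | 4.2% |
|  | 27 | 3.6% |
|  | 22 | 3.0% |
|  | 21 | 2.8% |
|  | 15 | 2.0% |
|  | 15 | 2.0% |
|  | 14 | 1.9% |
|  | 13 | 1.8% |
|  | 13 | 1.8% |
|  | 13 | 1.8% |
| Other values (191) | 557 | 75.2% |

ASCII
| Value | Count | Frequency (%) |
| ( | 26 | 34.7% |
| ) | 26 | 34.7% |
| 3 | 7 | 9.3% |
| 4 | 7 | 9.3% |
| 1 | 5 | 6.7% |
| 2 | 4 | 5.3% |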

Unnamed: 2
Unsupported

REJECTED  UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB

Unnamed: 3
Unsupported

REJECTED  UNSUPPORTED

Missing: 1
Missing (%): 0.4%
Memory size: 2.1 KiB

Unnamed: 4
Unsupported

MISSING  REJECTED  UNSUPPORTED

Missing: 241
Missing (%): 96.4%
Memory size: 2.1 KiB

2019.12.01.기준
Unsupported

MISSING  REJECTED  UNSUPPORTED

Missing: 250
Missing (%): 100.0%
Memory size: 2.3 KiB

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
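
One common way to compute a nullity correlation is the Pearson correlation between per-column missingness indicators. The sketch below assumes that definition and uses the missing/present pattern of rows 0 to 9 of the sample; the function name is hypothetical and this is not taken from the profiler's source.

```python
from math import sqrt


def nullity_correlation(missing_a, missing_b):
    """Pearson correlation between two 0/1 missingness indicators.

    Returns None when either column is always present or always missing,
    because the correlation is undefined without variation.
    """
    n = len(missing_a)
    mean_a = sum(missing_a) / n
    mean_b = sum(missing_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(missing_a, missing_b))
    var_a = sum((a - mean_a) ** 2 for a in missing_a)
    var_b = sum((b - mean_b) ** 2 for b in missing_b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)


# 1 = missing, 0 = present, for sample rows 0-9.
line_number_missing = [0, 0, 1, 1, 1, 1, 1, 1, 1, 0]  # "◎ 에스컬레이터 역별 대수"
line_total_missing = [0, 0, 1, 1, 1, 1, 1, 1, 1, 0]   # "Unnamed: 4"
count_missing = [0] * 10                               # "Unnamed: 3"

r_same = nullity_correlation(line_number_missing, line_total_missing)
r_undefined = nullity_correlation(line_number_missing, count_missing)
print(r_same, r_undefined)
```

In these rows the two sparse columns are missing together, so their correlation is close to 1, while a column with no missing values in the excerpt has no defined nullity correlation.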

Sample

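|  | ◎ 에스컬레이터 역별 대수 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | 2019.12.01.기준 |
| 0 | 호선 | 역명 | 기기 | 대수 | 합(호선별) | <NA> |
| 1 | 1 | 동대문(1) | E/S | 1 | 33 | <NA> |
| 2 | NaN | 동묘앞 | E/S | 12 | NaN | <NA> |
| 3 | NaN | 서울(1) | E/S | 5 | NaN | <NA> |
| 4 | NaN | 시청(1) | E/S | 3 | NaN | <NA> |
| 5 | NaN | 제기동 | E/S | 2 | NaN | <NA> |
| 6 | NaN | 종각 | E/S | 2 | NaN | <NA> |
| 7 | NaN | 종로3가(1) | E/S | 4 | NaN | <NA> |
| 8 | NaN | 청량리(1) | E/S | 4 | NaN | <NA> |
| 9 | 2 | 건대입구 | E/S | 2 | 228 | <NA> |

|  | ◎ 에스컬레이터 역별 대수 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | 2019.12.01.기준 |
| 240 | NaN | 단대오거리 | E/S | 4 | NaN | <NA> |
| 241 | NaN | 모란 | E/S | 2 | NaN | <NA> |
| 242 | NaN | 몽촌토성 | E/S | 8 | NaN | <NA> |
| 243 | NaN | 문정 | E/S | 2 | NaN | <NA> |
| 244 | NaN | 복정 | E/S | 3 | NaN | <NA> |
| 245 | NaN | 산성 | E/S | 13 | NaN | <NA> |
| 246 | NaN | 암사 | E/S | 4 | NaN | <NA> |
| 247 | NaN | 잠실 | E/S | 5 | NaN | <NA> |
| 248 | NaN | 장지 | E/S | 6 | NaN | <NA> |
| 249 | 총계 | <NA> | 1716 | NaN | NaN | <NA> |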
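
The sample suggests why most columns are flagged as unsupported: the real column names sit in row 0, and the line number (호선) appears only on the first station of each line. A minimal cleaning sketch in plain Python, assuming that reading; the `tidy` helper is hypothetical, `None` stands in for NaN/<NA>, and the values 1 and 33 in the first data row are read from the fused text "133" on the assumption that the eight line-1 counts sum to the line total.

```python
# Rows 0-4 of the sample; None stands in for NaN / <NA>.
raw_rows = [
    ["호선", "역명", "기기", "대수", "합(호선별)", None],
    [1, "동대문(1)", "E/S", 1, 33, None],      # assumed split of "133"
    [None, "동묘앞", "E/S", 12, None, None],
    [None, "서울(1)", "E/S", 5, None, None],
    [None, "시청(1)", "E/S", 3, None, None],
]


def tidy(rows):
    """Promote the first row to a header and forward-fill the line number."""
    header, *body = rows
    # Keep only columns that have a name in the embedded header row.
    keep = [i for i, name in enumerate(header) if name is not None]
    records = []
    current_line = None
    for row in body:
        if row[0] is not None:
            current_line = row[0]          # a new line number starts here
        filled = [current_line, *row[1:]]
        records.append({header[i]: filled[i] for i in keep})
    return records


records = tidy(raw_rows)
for record in records:
    print(record)
```

After this step every station row carries its line number, and the always-empty `2019.12.01.기준` column is dropped because the header row gives it no name.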

Duplicate rows

Most frequently occurring

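|  | Unnamed: 1 | # duplicates |
| 0 | 가락시장 | 2 |
| 1 | 건대입구 | 2 |
| 2 | 고속터미널 | 2 |
| 3 | 공덕 | 2 |
| 4 | 군자 | 2 |
| 5 | 노원 | 2 |
| 6 | 대림 | 2 |
| 7 | 동묘앞 | 2 |
| 8 | 불광 | 2 |
| 9 | 삼각지 | 2 |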
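
Because only `Unnamed: 1` is a supported type, the duplicate listing is effectively a count of repeated station names, which is expected for transfer stations that appear once per line. A small sketch of that kind of count; the input list is a made-up excerpt using names from this report, not the real column, and `duplicated_values` is a hypothetical helper.

```python
from collections import Counter


def duplicated_values(values):
    """Return {value: count} for non-missing values that occur more than once."""
    counts = Counter(v for v in values if v is not None)
    return {value: n for value, n in counts.items() if n > 1}


# Illustrative excerpt only (hypothetical ordering, not the full column).
stations = ["공덕", "불광", "암사", "공덕", "장지", "불광", None]

print(duplicated_values(stations))
```

Missing entries are skipped here so that a run of empty cells is not reported as a duplicated value.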