Overview

Dataset statistics

Number of variables: 6
Number of observations: 276
Missing cells: 811
Missing cells (%): 49.0%
Duplicate rows: 22
Duplicate rows (%): 8.0%
Total size in memory: 13.3 KiB
Average record size in memory: 49.5 B
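
The missing-cell figures above can be cross-checked against the per-variable missing counts reported in the Variables section. A minimal sketch, assuming the six per-column counts listed there (266, 1, 0, 1, 267, 276); the variable names are illustrative and not part of the report:

```python
# Per-column missing counts as reported in the Variables section.
missing_per_column = {
    "◎ 엘리베이터 역별 대수": 266,
    "Unnamed: 1": 1,
    "Unnamed: 2": 0,
    "Unnamed: 3": 1,
    "Unnamed: 4": 267,
    "2019.12.01.기준": 276,
}
n_rows = 276  # number of observations

missing_cells = sum(missing_per_column.values())
total_cells = n_rows * len(missing_per_column)
missing_pct = round(100 * missing_cells / total_cells, 1)

print(missing_cells, total_cells, missing_pct)  # 811 1656 49.0
```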

Variable types

Unsupported: 5
Text: 1

Dataset

Description: File download (파일 다운로드)
Author: 서울 교통공사 (Seoul Metro)
URL: https://data.seoul.go.kr/dataList/OA-13241/F/1/datasetView.do

Alerts

Dataset has 22 (8.0%) duplicate rows [Duplicates]
◎ 엘리베이터 역별 대수 has 266 (96.4%) missing values [Missing]
Unnamed: 4 has 267 (96.7%) missing values [Missing]
2019.12.01.기준 has 276 (100.0%) missing values [Missing]
◎ 엘리베이터 역별 대수 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 2 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 3 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 4 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
2019.12.01.기준 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]

Reproduction

Analysis started: 2023-12-11 04:00:51.434898
Analysis finished: 2023-12-11 04:00:51.908719
Duration: 0.47 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
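
A report like this one can be regenerated with ydata-profiling. The following is a minimal sketch, not the configuration actually used: the function name `build_report`, the input path, the assumption that the download is a CSV file, and the report title are all hypothetical. Korean public-data files sometimes need an explicit `encoding` argument, which is omitted here.

```python
def build_report(data_path, out_path="report.html"):
    """Profile the file at `data_path` and write an HTML report to `out_path`."""
    # Imported lazily so this module still loads where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the downloaded file is a CSV; adjust the reader if it is not.
    df = pd.read_csv(data_path)

    report = ProfileReport(
        df,
        title="Profiling Report",  # hypothetical title
        dataset={
            # Metadata copied from the Dataset section of this report.
            "description": "파일 다운로드",
            "author": "서울 교통공사",
            "url": "https://data.seoul.go.kr/dataList/OA-13241/F/1/datasetView.do",
        },
    )
    report.to_file(out_path)
    return out_path
```

Calling `build_report("elevators.csv")` (a hypothetical file name) would write `report.html` to the working directory.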

Variables

◎ 엘리베이터 역별 대수
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 266
Missing (%): 96.4%
Memory size: 2.3 KiB

Unnamed: 1
Text

Distinct: 253
Distinct (%): 92.0%
Missing: 1
Missing (%): 0.4%
Memory size: 2.3 KiB

Length

Max length: 12
Median length: 9
Mean length: 3.16
Min length: 2

Characters and Unicode

Total characters: 869
Distinct characters: 215
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
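
As an illustration of these character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and the input is just four station names taken from the sample rows, not the full column. The two-letter codes map to the category names used in this report: `Lo` is Other Letter, `Nd` is Decimal Number, `Ps` is Open Punctuation, and `Pe` is Close Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Station names taken from the sample rows of this variable.
sample = ["동대문(1)", "동묘앞", "서울(1)", "시청(1)"]
print(category_counts(sample))
```

The standard library does not expose the script or block of a character, so reproducing those two breakdowns would need another data source.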

Unique

Unique: 231
Unique (%): 84.0%

Sample

1st row: 역명
2nd row: 동대문(1)
3rd row: 동묘앞
4th row: 서울(1)
5th row: 시청(1)
Value | Count | Frequency (%)
건대입구 | 2 | 0.7%
천호 | 2 | 0.7%
잠실 | 2 | 0.7%
노원 | 2 | 0.7%
고속터미널 | 2 | 0.7%
충정로 | 2 | 0.7%
을지로4가 | 2 | 0.7%
대림 | 2 | 0.7%
가락시장 | 2 | 0.7%
공덕 | 2 | 0.7%
Other values (243) | 255 | 92.7%

Most occurring characters

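Value | Count | Frequency (%)
(Hangul character lost) | 32 | 3.7%
(Hangul character lost) | 27 | 3.1%
(Hangul character lost) | 25 | 2.9%
(Hangul character lost) | 24 | 2.8%
( | 20 | 2.3%
) | 20 | 2.3%
(Hangul character lost) | 19 | 2.2%
(Hangul character lost) | 16 | 1.8%
(Hangul character lost) | 15 | 1.7%
(Hangul character lost) | 14 | 1.6%
Other values (205) | 657 | 75.6%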

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 801 | 92.2%
Decimal Number | 28 | 3.2%
Open Punctuation | 20 | 2.3%
Close Punctuation | 20 | 2.3%

Most frequent character per category

Other Letter
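Value | Count | Frequency (%)
(character lost) | 32 | 4.0%
(character lost) | 27 | 3.4%
(character lost) | 25 | 3.1%
(character lost) | 24 | 3.0%
(character lost) | 19 | 2.4%
(character lost) | 16 | 2.0%
(character lost) | 15 | 1.9%
(character lost) | 14 | 1.7%
(character lost) | 14 | 1.7%
(character lost) | 14 | 1.7%
Other values (198) | 601 | 75.0%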
Decimal Number
Value | Count | Frequency (%)
3 | 8 | 28.6%
4 | 7 | 25.0%
2 | 6 | 21.4%
1 | 6 | 21.4%
5 | 1 | 3.6%
Open Punctuation
Value | Count | Frequency (%)
( | 20 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 20 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 801 | 92.2%
Common | 68 | 7.8%

Most frequent character per script

Hangul
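Value | Count | Frequency (%)
(character lost) | 32 | 4.0%
(character lost) | 27 | 3.4%
(character lost) | 25 | 3.1%
(character lost) | 24 | 3.0%
(character lost) | 19 | 2.4%
(character lost) | 16 | 2.0%
(character lost) | 15 | 1.9%
(character lost) | 14 | 1.7%
(character lost) | 14 | 1.7%
(character lost) | 14 | 1.7%
Other values (198) | 601 | 75.0%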
Common
Value | Count | Frequency (%)
( | 20 | 29.4%
) | 20 | 29.4%
3 | 8 | 11.8%
4 | 7 | 10.3%
2 | 6 | 8.8%
1 | 6 | 8.8%
5 | 1 | 1.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 801 | 92.2%
ASCII | 68 | 7.8%

Most frequent character per block

Hangul
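Value | Count | Frequency (%)
(character lost) | 32 | 4.0%
(character lost) | 27 | 3.4%
(character lost) | 25 | 3.1%
(character lost) | 24 | 3.0%
(character lost) | 19 | 2.4%
(character lost) | 16 | 2.0%
(character lost) | 15 | 1.9%
(character lost) | 14 | 1.7%
(character lost) | 14 | 1.7%
(character lost) | 14 | 1.7%
Other values (198) | 601 | 75.0%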
ASCII
Value | Count | Frequency (%)
( | 20 | 29.4%
) | 20 | 29.4%
3 | 8 | 11.8%
4 | 7 | 10.3%
2 | 6 | 8.8%
1 | 6 | 8.8%
5 | 1 | 1.5%

Unnamed: 2
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Unnamed: 3
Unsupported

REJECTED  UNSUPPORTED 

Missing: 1
Missing (%): 0.4%
Memory size: 2.3 KiB

Unnamed: 4
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 267
Missing (%): 96.7%
Memory size: 2.3 KiB

2019.12.01.기준
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 276
Missing (%): 100.0%
Memory size: 2.6 KiB

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
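
Nullity correlation can be read as the Pearson correlation between per-column "is missing" indicators. The sketch below illustrates the idea on the first ten sample rows only, so it does not reproduce the heatmap itself. The `pearson` helper and the variable names are hypothetical, and the 0/1 masks are transcribed by hand from the sample table.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences, or None if undefined."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # a column that is never or always missing has no variance
    return cov / sqrt(var_x * var_y)


# 1 = missing, 0 = present, for sample rows 0 to 9.
line_col_missing = [0, 0, 1, 1, 1, 1, 1, 1, 1, 1]   # "◎ 엘리베이터 역별 대수"
total_col_missing = [0, 0, 1, 1, 1, 1, 1, 1, 1, 1]  # "Unnamed: 4"
always_missing = [1] * 10                            # "2019.12.01.기준"

print(pearson(line_col_missing, total_col_missing))  # close to 1.0
print(pearson(line_col_missing, always_missing))     # None
```

In these rows the two columns are missing together, so their nullity correlation is close to 1, while a column that is missing everywhere has no defined correlation.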

Sample

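Index | ◎ 엘리베이터 역별 대수 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | 2019.12.01.기준
0 | 호선 | 역명 | 기기 | 대수 | 합(호선별) | <NA>
1 | 1 | 동대문(1) | E/L | 3 | 36 | <NA>
2 | NaN | 동묘앞 | E/L | 7 | NaN | <NA>
3 | NaN | 서울(1) | E/L | 4 | NaN | <NA>
4 | NaN | 시청(1) | E/L | 3 | NaN | <NA>
5 | NaN | 신설동(1) | E/L | 5 | NaN | <NA>
6 | NaN | 제기동 | E/L | 3 | NaN | <NA>
7 | NaN | 종각 | E/L | 4 | NaN | <NA>
8 | NaN | 종로3가(1) | E/L | 3 | NaN | <NA>
9 | NaN | 종로5가 | E/L | 3 | NaN | <NA>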
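Index | ◎ 엘리베이터 역별 대수 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | 2019.12.01.기준
266 | NaN | 산성 | E/L | 4 | NaN | <NA>
267 | NaN | 석촌 | E/L | 3 | NaN | <NA>
268 | NaN | 송파 | E/L | 3 | NaN | <NA>
269 | NaN | 수진 | E/L | 3 | NaN | <NA>
270 | NaN | 신흥 | E/L | 2 | NaN | <NA>
271 | NaN | 암사 | E/L | 3 | NaN | <NA>
272 | NaN | 잠실 | E/L | 3 | NaN | <NA>
273 | NaN | 장지 | E/L | 4 | NaN | <NA>
274 | NaN | 천호 | E/L | 2 | NaN | <NA>
275 | 총계 | <NA> | 813 | NaN | NaN | <NA>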
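
The sample suggests why most columns are flagged as unsupported. Row 0 holds the real header (호선 = line, 역명 = station name, 기기 = device, 대수 = count, 합(호선별) = total per line). The line number is filled in only on the first station of each line, and the last row is a grand total (총계). A minimal cleaning sketch under those assumptions follows. The helper `tidy_rows` and the English field names are hypothetical, values are treated as strings, and reading row 1 as count 3 with line total 36 is an assumption about the fused digits in the sample.

```python
def tidy_rows(rows):
    """Turn raw sheet rows into one record per station."""
    records = []
    current_line = None
    # rows[0] is the embedded header row, so start from rows[1].
    for line, station, device, count, _line_total, _unused in rows[1:]:
        if line == "총계":  # grand-total row, not a station
            continue
        if line is not None:
            current_line = line  # carry the line number down to later stations
        records.append(
            {"line": current_line, "station": station,
             "device": device, "count": int(count)}
        )
    return records


# A few rows transcribed from the sample; None stands for NaN / <NA>.
raw = [
    ["호선", "역명", "기기", "대수", "합(호선별)", None],
    ["1", "동대문(1)", "E/L", "3", "36", None],
    [None, "동묘앞", "E/L", "7", None, None],
    ["총계", None, "813", None, None, None],
]
for record in tidy_rows(raw):
    print(record)
```

The per-line totals and the always-empty last column are dropped because they can be recomputed from the station counts.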

Duplicate rows

Most frequently occurring

Index | Unnamed: 1 | # duplicates
0 | 가락시장 | 2
1 | 건대입구 | 2
2 | 고속터미널 | 2
3 | 공덕 | 2
4 | 군자 | 2
5 | 노원 | 2
6 | 대림 | 2
7 | 동묘앞 | 2
8 | 불광 | 2
9 | 삼각지 | 2
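
The listing shows only the Unnamed: 1 column, which suggests (as an assumption about how the report was produced) that duplicates were detected on the single supported text column. The reported figures are consistent with that reading: 253 distinct values minus 231 unique values leaves 22 values that occur more than once, and 22 of 276 rows is about 8.0%. A minimal sketch of the same check follows; the helper `duplicated_values` is hypothetical and the short `names` list is illustrative rather than the real column.

```python
from collections import Counter


def duplicated_values(values):
    """Return {value: count} for non-missing values that occur more than once."""
    counts = Counter(v for v in values if v is not None)
    return {v: c for v, c in counts.items() if c > 1}


# Illustrative input only; None stands for a missing station name.
names = ["가락시장", "건대입구", "동묘앞", "가락시장", "종각", "건대입구", None]
print(duplicated_values(names))  # {'가락시장': 2, '건대입구': 2}

# Cross-check against the figures reported for Unnamed: 1.
distinct, unique, n_rows = 253, 231, 276
repeated = distinct - unique
print(repeated, round(100 * repeated / n_rows, 1))  # 22 8.0
```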