Overview

Dataset statistics

Number of variables: 8
Number of observations: 233
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 10
Duplicate rows (%): 4.3%
Total size in memory: 14.7 KiB
Average record size in memory: 64.6 B
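
The two derived figures follow from the raw counts. A minimal sketch, assuming 1 KiB = 1024 B and using the rounded 14.7 KiB total as printed in the report:

```python
# Values copied from the "Dataset statistics" table of the report.
n_observations = 233
n_duplicate_rows = 10
total_size_kib = 14.7  # rounded value as printed in the report

# Share of duplicate rows among all observations.
duplicate_pct = 100 * n_duplicate_rows / n_observations

# Average record size in bytes (assumption: 1 KiB = 1024 B).
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both results round to the values shown above (4.3% and 64.6 B).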

Variable types

Categorical: 7
Text: 1

Dataset

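Description: Data on the urban and metropolitan railway stations on Seoul Metropolitan Area Line 2 (수도권2호선), covering the railway operator name (철도운영기관), line name (선명), station name (역명), up/down direction (상하행구분), exit number (출입구번호), detailed location (상세위치), start floor (시작층), and end floor (종료층).
Author: 국가철도공단 (Korea National Railway)
URL: https://www.data.go.kr/data/15041368/fileData.do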

Alerts

철도운영기관 has constant value "서울교통공사" (Constant)
선명 has constant value "2호선" (Constant)
Dataset has 10 (4.3%) duplicate rows (Duplicates)
상하행구분 is highly overall correlated with 시작층 and 1 other field (High correlation)
시작층 is highly overall correlated with 상하행구분 and 1 other field (High correlation)
종료층 is highly overall correlated with 상하행구분 and 1 other field (High correlation)

Reproduction

Analysis started: 2023-12-12 18:42:24.745737
Analysis finished: 2023-12-12 18:42:25.964389
Duration: 1.22 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
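
A report of this kind can be regenerated with ydata-profiling. The sketch below is a minimal example under stated assumptions: the file name `line2_stations.csv`, the `cp949` encoding, and the report title are hypothetical and not taken from the report, and the original `config.json` is not reproduced.

```python
def build_report(csv_path: str, out_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (sketch, not the original run)."""
    # Imported lazily so the function can be defined even when the
    # third-party packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the downloaded CSV is CP949-encoded; adjust if it is UTF-8.
    df = pd.read_csv(csv_path, encoding="cp949")

    # Default settings; the original run's config.json may have differed.
    profile = ProfileReport(df, title="Line 2 station profile")
    profile.to_file(out_path)

# Example call (hypothetical file name):
# build_report("line2_stations.csv")
```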

Variables

철도운영기관
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Value | Count
서울교통공사 | 233

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울교통공사
2nd row: 서울교통공사
3rd row: 서울교통공사
4th row: 서울교통공사
5th row: 서울교통공사

Common Values

Value | Count | Frequency (%)
서울교통공사 | 233 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

선명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Value | Count
2호선 | 233

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2호선
2nd row: 2호선
3rd row: 2호선
4th row: 2호선
5th row: 2호선

Common Values

Value | Count | Frequency (%)
2호선 | 233 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

역명
Categorical

Distinct: 44
Distinct (%): 18.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Value | Count
사당 | 12
용두(동대문구청) | 12
역삼 | 12
잠실새내 | 11
신도림 | 9
Other values (39) | 177

Length

Max length: 11
Median length: 10
Mean length: 4.3433476
Min length: 2

Unique

Unique: 1
Unique (%): 0.4%

Sample

1st row: 시청
2nd row: 시청
3rd row: 시청
4th row: 시청
5th row: 을지로입구

Common Values

Value | Count | Frequency (%)
사당 | 12 | 5.2%
용두(동대문구청) | 12 | 5.2%
역삼 | 12 | 5.2%
잠실새내 | 11 | 4.7%
신도림 | 9 | 3.9%
서울대입구(관악구청) | 8 | 3.4%
왕십리 | 8 | 3.4%
상왕십리 | 8 | 3.4%
뚝섬 | 8 | 3.4%
홍대입구 | 8 | 3.4%
Other values (34) | 137 | 58.8%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
사당 | 12 | 5.2%
역삼 | 12 | 5.2%
용두(동대문구청 | 12 | 5.2%
잠실새내 | 11 | 4.7%
신도림 | 9 | 3.9%
홍대입구 | 8 | 3.4%
성수 | 8 | 3.4%
신대방 | 8 | 3.4%
봉천 | 8 | 3.4%
서초 | 8 | 3.4%
Other values (34) | 137 | 58.8%

상하행구분
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Value | Count
상행 | 124
하행 | 109

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 하행
2nd row: 상행
3rd row: 하행
4th row: 상행
5th row: 하행

Common Values

Value | Count | Frequency (%)
상행 | 124 | 53.2%
하행 | 109 | 46.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

출입구번호
Categorical

Distinct: 11
Distinct (%): 4.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Value | Count
<NA> | 71
1 | 32
2 | 28
3 | 24
4 | 24
Other values (6) | 54

Length

Max length: 4
Median length: 1
Mean length: 1.9570815
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: 8

Common Values

Value | Count | Frequency (%)
<NA> | 71 | 30.5%
1 | 32 | 13.7%
2 | 28 | 12.0%
3 | 24 | 10.3%
4 | 24 | 10.3%
5 | 18 | 7.7%
8 | 13 | 5.6%
6 | 12 | 5.2%
7 | 5 | 2.1%
1-1 | 4 | 1.7%
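
The `<NA>` entry appears here as an ordinary category while Missing is 0, which suggests it is stored as a literal 4-character string rather than as a missing cell. The following is a rough consistency check, assuming the reported mean length (1.9570815) is the total character count divided by the 233 observations:

```python
# Counts copied from the Common Values table (10 of the 11 distinct values).
listed_counts = {
    "<NA>": 71, "1": 32, "2": 28, "3": 24, "4": 24,
    "5": 18, "8": 13, "6": 12, "7": 5, "1-1": 4,
}
n_observations = 233
mean_length = 1.9570815  # as reported in the Length section

# Total characters implied by the mean length.
total_chars = round(mean_length * n_observations)

# Rows and characters accounted for by the listed values,
# counting "<NA>" as a 4-character string.
listed_rows = sum(listed_counts.values())
listed_chars = sum(len(value) * count for value, count in listed_counts.items())

remaining_rows = n_observations - listed_rows
remaining_chars = total_chars - listed_chars
print(remaining_rows, remaining_chars)  # prints "2 4"
```

Under these assumptions the 2 rows not listed account for 4 characters, so the totals are consistent with `<NA>` being counted as text.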

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
na | 71 | 30.5%
1 | 32 | 13.7%
2 | 28 | 12.0%
3 | 24 | 10.3%
4 | 24 | 10.3%
5 | 18 | 7.7%
8 | 13 | 5.6%
6 | 12 | 5.2%
7 | 5 | 2.1%
1-1 | 4 | 1.7%

상세위치
Text

Distinct: 79
Distinct (%): 33.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Length

Max length: 22
Median length: 13
Mean length: 12.304721
Min length: 3

Characters and Unicode

Total characters: 2867
Distinct characters: 50
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
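
As a minimal sketch of the idea (not necessarily how ydata-profiling computes it internally), Python's standard `unicodedata` module can tally the general category of each character. The input is the 5th sample row of this variable; the helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category code (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)


counts = category_counts("(F1)8번 출입구 방면")
for code, n in counts.most_common():
    print(code, n)
```

The two-letter codes correspond to the category names used in this report: `Lo` is Other Letter, `Nd` Decimal Number, `Zs` Space Separator, `Ps` and `Pe` Open and Close Punctuation, `Lu` Uppercase Letter, `Pd` Dash Punctuation, and `Sm` Math Symbol.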

Unique

Unique: 42
Unique (%): 18.0%

Sample

1st row: (연결통로)서울역방향 1-4 근접 (2)
2nd row: (연결통로)서울역방향 1-4 근접 (2)
3rd row: (연결통로)내부화장실 근접 (2)
4th row: (연결통로)내부화장실 근접 (2)
5th row: (F1)8번 출입구 방면

Value | Count | Frequency (%)
출입구 | 174 | 27.7%
방면 | 174 | 27.7%
f1)1번 | 15 | 2.4%
f1)2번 | 13 | 2.1%
f1)4번 | 12 | 1.9%
근접 | 12 | 1.9%
b1)1번 | 11 | 1.8%
f1)3번 | 11 | 1.8%
b3)외선 | 10 | 1.6%
f1)5번 | 9 | 1.4%
Other values (76) | 187 | 29.8%

Most occurring characters

Value | Count | Frequency (%)
(space) | 436 | 15.2%
( | 238 | 8.3%
) | 237 | 8.3%
1 | 215 | 7.5%
[?] | 178 | 6.2%
[?] | 174 | 6.1%
[?] | 174 | 6.1%
[?] | 174 | 6.1%
[?] | 174 | 6.1%
[?] | 174 | 6.1%
Other values (40) | 693 | 24.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1217 | 42.4%
Decimal Number | 480 | 16.7%
Space Separator | 436 | 15.2%
Open Punctuation | 238 | 8.3%
Close Punctuation | 237 | 8.3%
Uppercase Letter | 221 | 7.7%
Dash Punctuation | 32 | 1.1%
Math Symbol | 6 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 178 | 14.6%
[?] | 174 | 14.3%
[?] | 174 | 14.3%
[?] | 174 | 14.3%
[?] | 174 | 14.3%
[?] | 174 | 14.3%
[?] | 26 | 2.1%
[?] | 18 | 1.5%
[?] | 12 | 1.0%
[?] | 12 | 1.0%
Other values (23) | 101 | 8.3%

Decimal Number
Value | Count | Frequency (%)
1 | 215 | 44.8%
2 | 96 | 20.0%
3 | 61 | 12.7%
4 | 36 | 7.5%
5 | 25 | 5.2%
8 | 18 | 3.8%
6 | 13 | 2.7%
7 | 12 | 2.5%
9 | 3 | 0.6%
0 | 1 | 0.2%

Uppercase Letter
Value | Count | Frequency (%)
B | 111 | 50.2%
F | 110 | 49.8%

Space Separator
Value | Count | Frequency (%)
(space) | 436 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 238 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 237 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 32 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 6 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1429 | 49.8%
Hangul | 1217 | 42.4%
Latin | 221 | 7.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 178 | 14.6%
[?] | 174 | 14.3%
[?] | 174 | 14.3%
[?] | 174 | 14.3%
[?] | 174 | 14.3%
[?] | 174 | 14.3%
[?] | 26 | 2.1%
[?] | 18 | 1.5%
[?] | 12 | 1.0%
[?] | 12 | 1.0%
Other values (23) | 101 | 8.3%

Common
Value | Count | Frequency (%)
(space) | 436 | 30.5%
( | 238 | 16.7%
) | 237 | 16.6%
1 | 215 | 15.0%
2 | 96 | 6.7%
3 | 61 | 4.3%
4 | 36 | 2.5%
- | 32 | 2.2%
5 | 25 | 1.7%
8 | 18 | 1.3%
Other values (5) | 35 | 2.4%

Latin
Value | Count | Frequency (%)
B | 111 | 50.2%
F | 110 | 49.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1650 | 57.6%
Hangul | 1217 | 42.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 436 | 26.4%
( | 238 | 14.4%
) | 237 | 14.4%
1 | 215 | 13.0%
B | 111 | 6.7%
F | 110 | 6.7%
2 | 96 | 5.8%
3 | 61 | 3.7%
4 | 36 | 2.2%
- | 32 | 1.9%
Other values (7) | 78 | 4.7%

Hangul
Value | Count | Frequency (%)
[?] | 178 | 14.6%
[?] | 174 | 14.3%
[?] | 174 | 14.3%
[?] | 174 | 14.3%
[?] | 174 | 14.3%
[?] | 174 | 14.3%
[?] | 26 | 2.1%
[?] | 18 | 1.5%
[?] | 12 | 1.0%
[?] | 12 | 1.0%
Other values (23) | 101 | 8.3%

시작층
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Value | Count
지상1 | 80
지하1 | 68
지하2 | 28
지상2 | 25
지하3 | 15
Other values (2) | 17

Length

Max length: 3
Median length: 3
Mean length: 2.9484979
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지하
2nd row: 지하
3rd row: 지하
4th row: 지하
5th row: 지상1

Common Values

Value | Count | Frequency (%)
지상1 | 80 | 34.3%
지하1 | 68 | 29.2%
지하2 | 28 | 12.0%
지상2 | 25 | 10.7%
지하3 | 15 | 6.4%
지하 | 12 | 5.2%
지상3 | 5 | 2.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

종료층
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Value | Count
지상1 | 73
지하1 | 64
지하2 | 46
지상3 | 20
지하 | 12
Other values (2) | 18

Length

Max length: 3
Median length: 3
Mean length: 2.9484979
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지하
2nd row: 지하
3rd row: 지하
4th row: 지하
5th row: 지하1

Common Values

Value | Count | Frequency (%)
지상1 | 73 | 31.3%
지하1 | 64 | 27.5%
지하2 | 46 | 19.7%
지상3 | 20 | 8.6%
지하 | 12 | 5.2%
지상2 | 10 | 4.3%
지하3 | 8 | 3.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Correlations

Variable | 역명 | 상하행구분 | 출입구번호 | 상세위치 | 시작층 | 종료층
역명 | 1.000 | 0.000 | 0.859 | 0.811 | 0.845 | 0.841
상하행구분 | 0.000 | 1.000 | 0.000 | 0.922 | 0.681 | 0.637
출입구번호 | 0.859 | 0.000 | 1.000 | 1.000 | 0.091 | 0.000
상세위치 | 0.811 | 0.922 | 1.000 | 1.000 | 1.000 | 0.972
시작층 | 0.845 | 0.681 | 0.091 | 1.000 | 1.000 | 0.975
종료층 | 0.841 | 0.637 | 0.000 | 0.972 | 0.975 | 1.000

Variable | 종료층 | 상하행구분 | 출입구번호 | 시작층 | 역명
종료층 | 1.000 | 0.679 | 0.000 | 0.752 | 0.490
상하행구분 | 0.679 | 1.000 | 0.000 | 0.728 | 0.000
출입구번호 | 0.000 | 0.000 | 1.000 | 0.043 | 0.462
시작층 | 0.752 | 0.728 | 0.043 | 1.000 | 0.496
역명 | 0.490 | 0.000 | 0.462 | 0.496 | 1.000

Variable | 역명 | 상하행구분 | 출입구번호 | 시작층 | 종료층
역명 | 1.000 | 0.000 | 0.462 | 0.496 | 0.490
상하행구분 | 0.000 | 1.000 | 0.000 | 0.728 | 0.679
출입구번호 | 0.462 | 0.000 | 1.000 | 0.043 | 0.000
시작층 | 0.496 | 0.728 | 0.043 | 1.000 | 0.752
종료층 | 0.490 | 0.679 | 0.000 | 0.752 | 1.000
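
The report does not state which association measure each matrix uses. For pairs of categorical variables, Cramér's V is one common choice. The sketch below computes the plain (uncorrected) statistic on made-up toy data, so it illustrates the idea only and is not expected to reproduce the values above; the function name `cramers_v` is hypothetical.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed counts per (x, y) pair
    row = Counter(xs)              # marginal counts of x
    col = Counter(ys)              # marginal counts of y

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, rx in row.items():
        for y, cy in col.items():
            expected = rx * cy / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row), len(col)) - 1
    if k == 0:
        return 0.0  # a constant variable carries no association
    return sqrt(chi2 / (n * k))


# Toy data (hypothetical): one perfectly associated pair, one independent pair.
print(cramers_v(list("aabb"), list("xxyy")))
print(cramers_v(list("aabb"), list("xyxy")))
```

The first call describes a perfect association and the second an independent pair, the two extremes of the 0 to 1 range used in the matrices.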

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

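Index | 철도운영기관 | 선명 | 역명 | 상하행구분 | 출입구번호 | 상세위치 | 시작층 | 종료층
0 | 서울교통공사 | 2호선 | 시청 | 하행 | <NA> | (연결통로)서울역방향 1-4 근접 (2) | 지하 | 지하
1 | 서울교통공사 | 2호선 | 시청 | 상행 | <NA> | (연결통로)서울역방향 1-4 근접 (2) | 지하 | 지하
2 | 서울교통공사 | 2호선 | 시청 | 하행 | <NA> | (연결통로)내부화장실 근접 (2) | 지하 | 지하
3 | 서울교통공사 | 2호선 | 시청 | 상행 | <NA> | (연결통로)내부화장실 근접 (2) | 지하 | 지하
4 | 서울교통공사 | 2호선 | 을지로입구 | 하행 | 8 | (F1)8번 출입구 방면 | 지상1 | 지하1
5 | 서울교통공사 | 2호선 | 을지로입구 | 상행 | 8 | (B1)8번 출입구 방면 | 지하1 | 지상1
6 | 서울교통공사 | 2호선 | 을지로입구 | 하행 | 8 | (F1)1번 출입구 방면 | 지상1 | 지하1
7 | 서울교통공사 | 2호선 | 을지로입구 | 상행 | 8 | (연결통로)1번 출입구 방면 | 지하 | 지하
8 | 서울교통공사 | 2호선 | 을지로입구 | 하행 | 8 | (연결통로)1번 출입구 방면 | 지하 | 지하
9 | 서울교통공사 | 2호선 | 을지로입구 | 상행 | 8 | (B1)1번 출입구 방면 | 지하1 | 지상1

Index | 철도운영기관 | 선명 | 역명 | 상하행구분 | 출입구번호 | 상세위치 | 시작층 | 종료층
223 | 서울교통공사 | 2호선 | 이대 | 상행 | <NA> | (B3)외선 2-2 | 지하3 | 지하2
224 | 서울교통공사 | 2호선 | 이대 | 하행 | <NA> | (B2) | 지하2 | 지하3
225 | 서울교통공사 | 2호선 | 이대 | 상행 | <NA> | (B3)외선 9-3 | 지하3 | 지하2
226 | 서울교통공사 | 2호선 | 아현 | 하행 | 1 | (F1)1번 출입구 방면 | 지상1 | 지하1
227 | 서울교통공사 | 2호선 | 아현 | 상행 | 1 | (B1)1번 출입구 방면 | 지하1 | 지상1
228 | 서울교통공사 | 2호선 | 충정로(경기대입구) | 하행 | <NA> | (B2) | 지하2 | 지하3
229 | 서울교통공사 | 2호선 | 충정로(경기대입구) | 상행 | <NA> | (B3)외선 4-3 | 지하3 | 지하2
230 | 서울교통공사 | 2호선 | 충정로(경기대입구) | 하행 | <NA> | (B1)대합실 | 지하1 | 지하2
231 | 서울교통공사 | 2호선 | 충정로(경기대입구) | 상행 | <NA> | (B2)대합실 | 지하2 | 지하1
232 | 서울교통공사 | 2호선 | 신도림 | 하행 | <NA> | (B2) | 지하2 | 지하3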

Duplicate rows

Most frequently occurring

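Index | 철도운영기관 | 선명 | 역명 | 상하행구분 | 출입구번호 | 상세위치 | 시작층 | 종료층 | # duplicates
0 | 서울교통공사 | 2호선 | 사당 | 하행 | <NA> | (B1) | 지하1 | 지하2 | 4
1 | 서울교통공사 | 2호선 | 신대방 | 하행 | <NA> | (F3) | 지상3 | 지상2 | 4
2 | 서울교통공사 | 2호선 | 신도림 | 상행 | <NA> | (B3)외선 | 지하3 | 지하2 | 3
7 | 서울교통공사 | 2호선 | 왕십리 | 상행 | <NA> | (환승통로) | 지하 | 지하 | 3
8 | 서울교통공사 | 2호선 | 왕십리 | 하행 | <NA> | (환승통로) | 지하 | 지하 | 3
3 | 서울교통공사 | 2호선 | 신도림 | 상행 | <NA> | (B3)외선 7-3 | 지하3 | 지하2 | 2
4 | 서울교통공사 | 2호선 | 신도림 | 하행 | <NA> | (B2) | 지하2 | 지하3 | 2
5 | 서울교통공사 | 2호선 | 역삼 | 상행 | <NA> | (B1)125~8번 출입구 방면 근접 | 지하1 | 지상1 | 2
6 | 서울교통공사 | 2호선 | 역삼 | 하행 | <NA> | (F1)125~8번 출입구 방면 근접 | 지상1 | 지하1 | 2
9 | 서울교통공사 | 2호선 | 이대 | 하행 | <NA> | (B2) | 지하2 | 지하3 | 2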
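
The "# duplicates" column gives the size of each group of identical rows. The 10 duplicate rows reported in the overview match the number of groups listed here, not the number of surplus records. A small check, assuming this table lists every duplicated pattern:

```python
# "# duplicates" values copied from the table, one per group of identical rows.
group_sizes = [4, 4, 3, 3, 3, 2, 2, 2, 2, 2]
n_observations = 233

n_groups = len(group_sizes)              # distinct duplicated row patterns
rows_in_groups = sum(group_sizes)        # rows that belong to some group
surplus_rows = rows_in_groups - n_groups # rows removed if one copy is kept

print(n_groups, rows_in_groups, surplus_rows)      # prints "10 27 17"
print(f"{100 * n_groups / n_observations:.1f}%")   # prints "4.3%"
```

Under that assumption, keeping one copy of each pattern would remove 17 rows and leave 216 observations.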