Overview

Dataset statistics

Number of variables: 8
Number of observations: 457
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 64
Duplicate rows (%): 14.0%
Total size in memory: 28.7 KiB
Average record size in memory: 64.3 B
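The two derived figures can be rechecked from the raw counts. The following is a minimal sketch, assuming the reported total size uses 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Recompute the derived dataset statistics from the raw counts in the report.
n_rows = 457        # Number of observations
n_duplicates = 64   # Duplicate rows
total_kib = 28.7    # Total size in memory; assumes 1 KiB = 1024 bytes

duplicate_pct = round(n_duplicates / n_rows * 100, 1)
avg_record_bytes = round(total_kib * 1024 / n_rows, 1)

print(f"Duplicate rows (%): {duplicate_pct}%")
print(f"Average record size in memory: {avg_record_bytes} B")
```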

Variable types

Categorical: 6
Text: 2

Dataset

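Description: Data on the urban and metropolitan railway stations of Seoul Metropolitan Area Line 5 (수도권 5호선): railway operator name (철도운영기관명), line name (선명), station name (역명), up/down direction (상하행구분), exit number (출입구번호), detailed location (상세위치), start floor (시작역층), and end floor (종료역층).
Author: 국가철도공단 (Korea National Railway)
URL: https://www.data.go.kr/data/15041371/fileData.do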

Alerts

철도운영기관 has constant value "서울교통공사" (Constant)
선명 has constant value "5호선" (Constant)
Dataset has 64 (14.0%) duplicate rows (Duplicates)
상하행구분 is highly overall correlated with 시작층 and 1 other field (High correlation)
시작층 is highly overall correlated with 상하행구분 (High correlation)
종료층 is highly overall correlated with 상하행구분 (High correlation)

Reproduction

Analysis started: 2023-12-12 19:41:10.822669
Analysis finished: 2023-12-12 19:41:11.436752
Duration: 0.61 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

철도운영기관
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB

서울교통공사: 457

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울교통공사
2nd row: 서울교통공사
3rd row: 서울교통공사
4th row: 서울교통공사
5th row: 서울교통공사

Common Values

Value | Count | Frequency (%)
서울교통공사 | 457 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

선명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB

5호선: 457

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 5호선
2nd row: 5호선
3rd row: 5호선
4th row: 5호선
5th row: 5호선

Common Values

Value | Count | Frequency (%)
5호선 | 457 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

역명
Text

Distinct: 52
Distinct (%): 11.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB

Length

Max length: 13
Median length: 11
Mean length: 4.7877462
Min length: 2

Characters and Unicode

Total characters: 2188
Distinct characters: 110
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
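
As an illustration of how such character properties can be read, here is a minimal sketch using Python's standard `unicodedata` module. It returns two-letter general category codes (`Lo` = Other Letter, `Ps` = Open Punctuation, `Pe` = Close Punctuation, `Po` = Other Punctuation, `Nd` = Decimal Number) rather than the long names shown in this report. The helper name `category_counts` is hypothetical, and this is not necessarily how the profiling tool computes its figures.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Illustrative input: a single station name from this dataset.
counts = category_counts(["하남시청(덕풍·신장)"])
for code, n in counts.most_common():
    print(code, n)
```

Passing the full 역명 column instead of one name would give totals comparable to the category table in this section.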

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강동
2nd row: 강동
3rd row: 강동
4th row: 강동
5th row: 강동

Value | Count | Frequency (%)
하남시청(덕풍·신장 | 28 | 6.1%
하남풍산 | 24 | 5.3%
미사 | 22 | 4.8%
하남검단산역 | 20 | 4.4%
천호(풍납토성 | 16 | 3.5%
상일동 | 16 | 3.5%
강동 | 14 | 3.1%
강일 | 14 | 3.1%
영등포구청 | 14 | 3.1%
오목교(목동운동장앞 | 12 | 2.6%
Other values (42) | 277 | 60.6%

Most occurring characters

ValueCountFrequency (%)
( 112
 
5.1%
) 112
 
5.1%
80
 
3.7%
72
 
3.3%
72
 
3.3%
68
 
3.1%
68
 
3.1%
64
 
2.9%
63
 
2.9%
48
 
2.2%
Other values (100) 1429
65.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1915 | 87.5%
Open Punctuation | 112 | 5.1%
Close Punctuation | 112 | 5.1%
Other Punctuation | 28 | 1.3%
Decimal Number | 21 | 1.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
80
 
4.2%
72
 
3.8%
72
 
3.8%
68
 
3.6%
68
 
3.6%
64
 
3.3%
63
 
3.3%
48
 
2.5%
45
 
2.3%
42
 
2.2%
Other values (95) 1293
67.5%
Decimal Number
ValueCountFrequency (%)
4 11
52.4%
3 10
47.6%
Open Punctuation
ValueCountFrequency (%)
( 112
100.0%
Close Punctuation
ValueCountFrequency (%)
) 112
100.0%
Other Punctuation
ValueCountFrequency (%)
· 28
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1915 | 87.5%
Common | 273 | 12.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
80
 
4.2%
72
 
3.8%
72
 
3.8%
68
 
3.6%
68
 
3.6%
64
 
3.3%
63
 
3.3%
48
 
2.5%
45
 
2.3%
42
 
2.2%
Other values (95) 1293
67.5%
Common
ValueCountFrequency (%)
( 112
41.0%
) 112
41.0%
· 28
 
10.3%
4 11
 
4.0%
3 10
 
3.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1915 | 87.5%
ASCII | 245 | 11.2%
None | 28 | 1.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
( 112
45.7%
) 112
45.7%
4 11
 
4.5%
3 10
 
4.1%
Hangul
ValueCountFrequency (%)
80
 
4.2%
72
 
3.8%
72
 
3.8%
68
 
3.6%
68
 
3.6%
64
 
3.3%
63
 
3.3%
48
 
2.5%
45
 
2.3%
42
 
2.2%
Other values (95) 1293
67.5%
None
ValueCountFrequency (%)
· 28
100.0%

상하행구분
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB

상행: 239
하행: 218

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 하행
2nd row: 상행
3rd row: 상행
4th row: 하행
5th row: 하행

Common Values

Value | Count | Frequency (%)
상행 | 239 | 52.3%
하행 | 218 | 47.7%

Length

Histogram of lengths of the category

Common Values (Plot)

출입구번호
Categorical

Distinct: 13
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB

<NA>: 224
4: 49
1: 43
3: 39
2: 28
Other values (8): 74

Length

Max length: 4
Median length: 3
Mean length: 2.5098468
Min length: 1

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 224 | 49.0%
4 | 49 | 10.7%
1 | 43 | 9.4%
3 | 39 | 8.5%
2 | 28 | 6.1%
5 | 22 | 4.8%
6 | 16 | 3.5%
7 | 15 | 3.3%
1/2 | 6 | 1.3%
8 | 6 | 1.3%
Other values (3) | 9 | 2.0%
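
The report lists 0 missing values for 출입구번호 while `<NA>` accounts for 224 rows and the maximum length is 4, which suggests the placeholder was loaded as a literal string rather than as a null. A minimal sketch of counting such placeholders as missing follows; the `PLACEHOLDERS` set and the `missing_share` helper are hypothetical, and the value list is rebuilt from the counts in the Common Values table rather than from the original file.

```python
from collections import Counter

# Hypothetical set of placeholder strings to treat as missing.
PLACEHOLDERS = {"<NA>", "", "NA", "N/A"}


def missing_share(values, placeholders=PLACEHOLDERS):
    """Return (count, percent) of values that are None or placeholder strings."""
    n_missing = sum(1 for v in values if v is None or v.strip() in placeholders)
    return n_missing, round(n_missing / len(values) * 100, 1)


# Value counts taken from the Common Values table; the 9 rows the report
# groups as "Other values (3)" are lumped under a stand-in label.
counts = Counter({"<NA>": 224, "4": 49, "1": 43, "3": 39, "2": 28, "5": 22,
                  "6": 16, "7": 15, "1/2": 6, "8": 6, "other": 9})
values = list(counts.elements())
print(missing_share(values))
```

Under this assumption roughly half of the column would count as missing, which changes how the 출입구번호 correlations should be read.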

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
na | 224 | 49.0%
4 | 49 | 10.7%
1 | 43 | 9.4%
3 | 39 | 8.5%
2 | 28 | 6.1%
5 | 22 | 4.8%
6 | 16 | 3.5%
7 | 15 | 3.3%
1/2 | 6 | 1.3%
8 | 6 | 1.3%
Other values (3) | 9 | 2.0%

상세위치
Text

Distinct: 124
Distinct (%): 27.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB

Length

Max length: 16
Median length: 15
Mean length: 8.297593
Min length: 4

Characters and Unicode

Total characters: 3792
Distinct characters: 50
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 82
Unique (%): 17.9%

Sample

1st row: (B1)ES-1
2nd row: (B3)ES-2
3rd row: (B3)ES-3
4th row: (B1)ES-4
5th row: (B3)ES-5

Value | Count | Frequency (%)
b1)b1대합실 | 68 | 13.7%
b1 | 58 | 11.7%
b2)b2대합실 | 37 | 7.5%
b2 | 29 | 5.9%
b3)b3승강장 | 18 | 3.6%
출입구방면 | 17 | 3.4%
b3 | 14 | 2.8%
b4 | 14 | 2.8%
f1)4번출입구 | 12 | 2.4%
f1)3번출입구 | 12 | 2.4%
Other values (103) | 216 | 43.6%

Most occurring characters

ValueCountFrequency (%)
B 538
14.2%
) 483
12.7%
( 483
12.7%
1 387
 
10.2%
2 181
 
4.8%
126
 
3.3%
126
 
3.3%
126
 
3.3%
3 123
 
3.2%
123
 
3.2%
Other values (40) 1096
28.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1144 | 30.2%
Decimal Number | 841 | 22.2%
Uppercase Letter | 715 | 18.9%
Close Punctuation | 483 | 12.7%
Open Punctuation | 483 | 12.7%
Dash Punctuation | 68 | 1.8%
Space Separator | 55 | 1.5%
Math Symbol | 2 | 0.1%
Other Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
126
11.0%
126
11.0%
126
11.0%
123
10.8%
120
10.5%
120
10.5%
116
10.1%
58
 
5.1%
40
 
3.5%
40
 
3.5%
Other values (17) 149
13.0%
Decimal Number
ValueCountFrequency (%)
1 387
46.0%
2 181
21.5%
3 123
 
14.6%
4 73
 
8.7%
5 42
 
5.0%
6 9
 
1.1%
7 8
 
1.0%
8 7
 
0.8%
9 6
 
0.7%
0 5
 
0.6%
Uppercase Letter
ValueCountFrequency (%)
B 538
75.2%
F 102
 
14.3%
M 25
 
3.5%
S 24
 
3.4%
E 24
 
3.4%
D 1
 
0.1%
W 1
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 483
100.0%
Open Punctuation
ValueCountFrequency (%)
( 483
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 68
100.0%
Space Separator
ValueCountFrequency (%)
55
100.0%
Math Symbol
ValueCountFrequency (%)
~ 2
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1933 | 51.0%
Hangul | 1144 | 30.2%
Latin | 715 | 18.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
126
11.0%
126
11.0%
126
11.0%
123
10.8%
120
10.5%
120
10.5%
116
10.1%
58
 
5.1%
40
 
3.5%
40
 
3.5%
Other values (17) 149
13.0%
Common
ValueCountFrequency (%)
) 483
25.0%
( 483
25.0%
1 387
20.0%
2 181
 
9.4%
3 123
 
6.4%
4 73
 
3.8%
- 68
 
3.5%
55
 
2.8%
5 42
 
2.2%
6 9
 
0.5%
Other values (6) 29
 
1.5%
Latin
ValueCountFrequency (%)
B 538
75.2%
F 102
 
14.3%
M 25
 
3.5%
S 24
 
3.4%
E 24
 
3.4%
D 1
 
0.1%
W 1
 
0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2648 | 69.8%
Hangul | 1144 | 30.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
B 538
20.3%
) 483
18.2%
( 483
18.2%
1 387
14.6%
2 181
 
6.8%
3 123
 
4.6%
F 102
 
3.9%
4 73
 
2.8%
- 68
 
2.6%
55
 
2.1%
Other values (13) 155
 
5.9%
Hangul
ValueCountFrequency (%)
126
11.0%
126
11.0%
126
11.0%
123
10.8%
120
10.5%
120
10.5%
116
10.1%
58
 
5.1%
40
 
3.5%
40
 
3.5%
Other values (17) 149
13.0%

시작층
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB

지하1: 177
지상1: 102
지하2: 81
지하3: 56
지하4: 29
Other values (2): 12

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: 지하1
2nd row: 지하3
3rd row: 지하3
4th row: 지하1
5th row: 지하3

Common Values

Value | Count | Frequency (%)
지하1 | 177 | 38.7%
지상1 | 102 | 22.3%
지하2 | 81 | 17.7%
지하3 | 56 | 12.3%
지하4 | 29 | 6.3%
지하5 | 11 | 2.4%
지하8 | 1 | 0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

종료층
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB

지하1: 177
지상1: 108
지하2: 80
지하3: 56
지하4: 25
Other values (2): 11

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: 지하3
2nd row: 지하1
3rd row: 지하1
4th row: 지하3
5th row: 지하4

Common Values

Value | Count | Frequency (%)
지하1 | 177 | 38.7%
지상1 | 108 | 23.6%
지하2 | 80 | 17.5%
지하3 | 56 | 12.3%
지하4 | 25 | 5.5%
지하5 | 10 | 2.2%
지하8 | 1 | 0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Correlations

Variable | 역명 | 상하행구분 | 출입구번호 | 시작층 | 종료층
역명 | 1.000 | 0.000 | 0.670 | 0.596 | 0.564
상하행구분 | 0.000 | 1.000 | 0.000 | 0.556 | 0.525
출입구번호 | 0.670 | 0.000 | 1.000 | 0.000 | 0.000
시작층 | 0.596 | 0.556 | 0.000 | 1.000 | 0.811
종료층 | 0.564 | 0.525 | 0.000 | 0.811 | 1.000

Variable | 출입구번호 | 종료층 | 상하행구분 | 시작층
출입구번호 | 1.000 | 0.000 | 0.000 | 0.000
종료층 | 0.000 | 1.000 | 0.561 | 0.409
상하행구분 | 0.000 | 0.561 | 1.000 | 0.594
시작층 | 0.000 | 0.409 | 0.594 | 1.000

Variable | 상하행구분 | 출입구번호 | 시작층 | 종료층
상하행구분 | 1.000 | 0.000 | 0.594 | 0.561
출입구번호 | 0.000 | 1.000 | 0.000 | 0.000
시작층 | 0.594 | 0.000 | 1.000 | 0.409
종료층 | 0.561 | 0.000 | 0.409 | 1.000
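
The report does not say which coefficient produced each matrix. Cramér's V is one standard association measure for pairs of categorical columns, so the sketch below shows the uncorrected form as a point of reference. It is not claimed to reproduce the figures in these matrices, and the function name `cramers_v` is illustrative. The toy input pairs 상하행구분 with 시작층 for rows 0 to 7 of the sample table.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k == 0:
        return 0.0  # a constant column carries no association
    chi2 = 0.0
    # Sum over every cell of the contingency table, including empty cells.
    for x, rx in row_totals.items():
        for y, cy in col_totals.items():
            expected = rx * cy / n
            chi2 += (observed[(x, y)] - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))


direction = ["하행", "상행", "상행", "하행", "하행", "상행", "하행", "상행"]
start_floor = ["지하1", "지하3", "지하3", "지하1", "지하3", "지하4", "지하3", "지하4"]
print(cramers_v(direction, start_floor))
```

The value ranges from 0 (no association) to 1 (one column fully determines the other); bias-corrected variants give somewhat lower values on small samples.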

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

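Index | 철도운영기관 | 선명 | 역명 | 상하행구분 | 출입구번호 | 상세위치 | 시작층 | 종료층
0 | 서울교통공사 | 5호선 | 강동 | 하행 | <NA> | (B1)ES-1 | 지하1 | 지하3
1 | 서울교통공사 | 5호선 | 강동 | 상행 | <NA> | (B3)ES-2 | 지하3 | 지하1
2 | 서울교통공사 | 5호선 | 강동 | 상행 | <NA> | (B3)ES-3 | 지하3 | 지하1
3 | 서울교통공사 | 5호선 | 강동 | 하행 | <NA> | (B1)ES-4 | 지하1 | 지하3
4 | 서울교통공사 | 5호선 | 강동 | 하행 | <NA> | (B3)ES-5 | 지하3 | 지하4
5 | 서울교통공사 | 5호선 | 강동 | 상행 | <NA> | (B4)ES-6 | 지하4 | 지하3
6 | 서울교통공사 | 5호선 | 강동 | 하행 | <NA> | (B3)ES-7 | 지하3 | 지하4
7 | 서울교통공사 | 5호선 | 강동 | 상행 | <NA> | (B4)ES-8 | 지하4 | 지하3
8 | 서울교통공사 | 5호선 | 강동 | 상행 | 4 | (B1)ES-11 | 지하1 | 지상1
9 | 서울교통공사 | 5호선 | 강동 | 하행 | 4 | (F1)4번출입구 | 지상1 | 지하1

Index | 철도운영기관 | 선명 | 역명 | 상하행구분 | 출입구번호 | 상세위치 | 시작층 | 종료층
447 | 서울교통공사 | 5호선 | 행당 | 상행 | 1 | (B1) | 지하1 | 지상1
448 | 서울교통공사 | 5호선 | 행당 | 하행 | 1 | (F1)1번출입구-2 | 지상1 | 지하1
449 | 서울교통공사 | 5호선 | 행당 | 상행 | <NA> | (B4) | 지하4 | 지하3
450 | 서울교통공사 | 5호선 | 행당 | 하행 | <NA> | (B3) | 지하3 | 지하4
451 | 서울교통공사 | 5호선 | 행당 | 상행 | 4 | (B1) | 지하1 | 지상1
452 | 서울교통공사 | 5호선 | 행당 | 하행 | 4 | (F1)4번출입구-2 | 지상1 | 지하1
453 | 서울교통공사 | 5호선 | 화곡 | 상행 | 1/2 | (B1)B1대합실 | 지하1 | 지하1
454 | 서울교통공사 | 5호선 | 화곡 | 하행 | 1/2 | (BM)12번 출입구방면 | 지하1 | 지하1
455 | 서울교통공사 | 5호선 | 화곡 | 상행 | 3 | (BM)3번 출입구방면 | 지하1 | 지상1
456 | 서울교통공사 | 5호선 | 화곡 | 하행 | 3 | (F1)3번출입구 | 지상1 | 지하1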

Duplicate rows

Most frequently occurring

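Index | 철도운영기관 | 선명 | 역명 | 상하행구분 | 출입구번호 | 상세위치 | 시작층 | 종료층 | # duplicates
46 | 서울교통공사 | 5호선 | 왕십리 | 상행 | <NA> | (B4) | 지하4 | 지하2 | 5
0 | 서울교통공사 | 5호선 | 강일 | 상행 | <NA> | (B3)B3승강장 | 지하3 | 지하2 | 4
1 | 서울교통공사 | 5호선 | 강일 | 하행 | <NA> | (B2)B2대합실 | 지하2 | 지하3 | 4
6 | 서울교통공사 | 5호선 | 김포공항 | 상행 | <NA> | (B3)B3승강장 | 지하3 | 지하2 | 4
7 | 서울교통공사 | 5호선 | 김포공항 | 하행 | <NA> | (B2)B2대합실 | 지하2 | 지하3 | 4
8 | 서울교통공사 | 5호선 | 까치산 | 상행 | <NA> | (B5)B5승강장 | 지하5 | 지하1 | 4
9 | 서울교통공사 | 5호선 | 까치산 | 하행 | <NA> | (B1)B1대합실 | 지하1 | 지하5 | 4
55 | 서울교통공사 | 5호선 | 하남검단산역 | 상행 | <NA> | (B3)B3승강장 | 지하3 | 지하2 | 4
57 | 서울교통공사 | 5호선 | 하남검단산역 | 하행 | <NA> | (B2)B2대합실 | 지하2 | 지하3 | 4
59 | 서울교통공사 | 5호선 | 하남시청(덕풍·신장) | 상행 | <NA> | (B2)B2승강장 | 지하2 | 지하1 | 4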
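
The `# duplicates` column appears to give the number of times each repeated row occurs. A minimal sketch of that kind of grouping with the standard library follows; the helper `duplicate_groups` and the three-field rows are illustrative only and do not reproduce the full dataset.

```python
from collections import Counter


def duplicate_groups(rows):
    """Map each row that occurs more than once to its occurrence count."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Illustrative rows (역명, 상하행구분, 상세위치); not the real data.
rows = [
    ("왕십리", "상행", "(B4)"),
    ("왕십리", "상행", "(B4)"),
    ("왕십리", "상행", "(B4)"),
    ("강동", "하행", "(B1)ES-1"),
]
groups = duplicate_groups(rows)
for row, n in sorted(groups.items(), key=lambda item: -item[1]):
    print(row, n)
```

Sorting by count in descending order matches the "Most frequently occurring" ordering of the table.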