Overview

Dataset statistics

Number of variables: 8
Number of observations: 279
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 22
Duplicate rows (%): 7.9%
Total size in memory: 17.6 KiB
Average record size in memory: 64.5 B
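
The duplicate figures above can be checked by hand. The sketch below assumes that "duplicate rows" counts each distinct row pattern that occurs more than once, once per pattern, and divides by the total number of rows; the report does not state this definition, so treat it as an assumption. The function name `duplicate_summary` and the toy rows are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def duplicate_summary(rows):
    """Return (number of repeated patterns, share of all rows in %, pattern counts).

    Assumption: a pattern is counted once, however many times it repeats.
    """
    counts = Counter(tuple(row) for row in rows)
    repeated = {row: n for row, n in counts.items() if n > 1}
    total = len(rows)
    pct = 100.0 * len(repeated) / total if total else 0.0
    return len(repeated), pct, repeated


# Toy rows for illustration only (hypothetical values).
rows = [
    ("A", "up", "B1"),
    ("A", "up", "B1"),
    ("A", "down", "B2"),
    ("B", "up", "B1"),
    ("B", "up", "B1"),
    ("B", "up", "B1"),
]

n_patterns, pct, repeated = duplicate_summary(rows)
print(n_patterns, round(pct, 1))

# The percentage reported above follows from the two counts in the table.
print(round(100 * 22 / 279, 1))
```

Under this reading, 22 patterns out of 279 rows gives the reported 7.9%.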

Variable types

Categorical: 7
Text: 1

Dataset

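Description: Data on the urban and metropolitan railway stations on Seoul Metropolitan Area Line 3 (수도권3호선). Each record gives the railway operator (철도운영기관), line name (선명), station name (역명), up/down direction (상하행구분), exit number (출입구번호), detailed location (상세위치), start floor (시작층), and end floor (종료층).
Author: 국가철도공단
URL: https://www.data.go.kr/data/15041369/fileData.do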

Alerts

선명 has constant value "3호선" (Constant)
Dataset has 22 (7.9%) duplicate rows (Duplicates)
역명 is highly overall correlated with 철도운영기관 (High correlation)
철도운영기관 is highly overall correlated with 역명 (High correlation)
상하행구분 is highly overall correlated with 시작층 and 1 other field (High correlation)
시작층 is highly overall correlated with 상하행구분 and 1 other field (High correlation)
종료층 is highly overall correlated with 상하행구분 and 1 other field (High correlation)

Reproduction

Analysis started: 2023-12-12 22:35:24.589570
Analysis finished: 2023-12-12 22:35:25.155912
Duration: 0.57 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

철도운영기관
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

서울교통공사: 202
코레일: 77

Length

Max length: 6
Median length: 6
Mean length: 5.172043
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 코레일
2nd row: 코레일
3rd row: 코레일
4th row: 코레일
5th row: 서울교통공사

Common Values

Value | Count | Frequency (%)
서울교통공사 | 202 | 72.4%
코레일 | 77 | 27.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

선명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

3호선: 279

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3호선
2nd row: 3호선
3rd row: 3호선
4th row: 3호선
5th row: 3호선

Common Values

Value | Count | Frequency (%)
3호선 | 279 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

역명
Categorical

HIGH CORRELATION 

Distinct: 38
Distinct (%): 13.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

고속터미널: 24
가락시장: 22
경찰병원: 20
삼송: 16
오금: 15
Other values (33): 182

Length

Max length: 12
Median length: 2
Mean length: 3.265233
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대곡
2nd row: 대곡
3rd row: 대곡
4th row: 대곡
5th row: 일원

Common Values

Value | Count | Frequency (%)
고속터미널 | 24 | 8.6%
가락시장 | 22 | 7.9%
경찰병원 | 20 | 7.2%
삼송 | 16 | 5.7%
오금 | 15 | 5.4%
주엽 | 12 | 4.3%
수서 | 12 | 4.3%
마두 | 12 | 4.3%
안국 | 10 | 3.6%
경복궁(정부서울청사) | 10 | 3.6%
Other values (28) | 126 | 45.2%

Length

[Figure: histogram of lengths of the category]

상하행구분
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

상행: 149
하행: 130

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 하행
2nd row: 상행
3rd row: 하행
4th row: 상행
5th row: 하행

Common Values

Value | Count | Frequency (%)
상행 | 149 | 53.4%
하행 | 130 | 46.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

출입구번호
Categorical

Distinct: 10
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

<NA>: 147
6: 23
1: 22
5: 21
3: 19
Other values (5): 47

Length

Max length: 4
Median length: 4
Mean length: 2.609319
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: 3

Common Values

Value | Count | Frequency (%)
<NA> | 147 | 52.7%
6 | 23 | 8.2%
1 | 22 | 7.9%
5 | 21 | 7.5%
3 | 19 | 6.8%
2 | 17 | 6.1%
4 | 16 | 5.7%
8 | 8 | 2.9%
3-1 | 4 | 1.4%
7 | 2 | 0.7%
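
The report lists 0 missing values for 출입구번호 even though `<NA>` accounts for 147 of 279 rows, and the maximum length of 4 matches the four characters of the literal text `<NA>`. This suggests the placeholder was profiled as an ordinary category. The sketch below uses the counts from the Common Values table to check that reading against the reported mean length, then recounts missing values with placeholders treated as missing. The placeholder set and both function names are assumptions made for illustration.

```python
# Counts copied from the Common Values table for 출입구번호.
counts = {
    "<NA>": 147, "6": 23, "1": 22, "5": 21, "3": 19,
    "2": 17, "4": 16, "8": 8, "3-1": 4, "7": 2,
}

# Assumed placeholder spellings; only "<NA>" actually appears in this report.
PLACEHOLDERS = {"<NA>", "NA", "N/A", ""}


def mean_length(counts):
    """Mean string length, weighting each value by its row count."""
    total = sum(counts.values())
    return sum(len(value) * n for value, n in counts.items()) / total


def missing_share(counts, placeholders=PLACEHOLDERS):
    """Return (rows whose value is a placeholder, their share of all rows in %)."""
    total = sum(counts.values())
    n_missing = sum(n for value, n in counts.items() if value.strip() in placeholders)
    return n_missing, 100.0 * n_missing / total


print(round(mean_length(counts), 6))
n_missing, pct_missing = missing_share(counts)
print(n_missing, round(pct_missing, 1))
```

Counting `<NA>` as a four-character string gives a mean length of 728 / 279, which rounds to the reported 2.609319. Treated as missing, the same 147 rows would put the column at 52.7% missing rather than 0.0%.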

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

상세위치
Text

Distinct: 162
Distinct (%): 58.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Length

Max length: 47
Median length: 40
Mean length: 12.62724
Min length: 3

Characters and Unicode

Total characters: 3523
Distinct characters: 97
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 127
Unique (%): 45.5%

Sample

1st row: (3F)화정역 방향 승강장 7-4 출입문 앞
2nd row: (2F)맞이방 안내부스 앞
3rd row: (3F)백석역 방향 승강장 3-4 출입문 앞
4th row: (2F)맞이방 편의점 앞
5th row: (F1)3번 출입구 방면

Value | Count | Frequency (%)
출입구 | 126 | 16.3%
방면 | 84 | 10.9%
 | 45 | 5.8%
b2 | 37 | 4.8%
b1)맞이방 | 28 | 3.6%
 | 25 | 3.2%
b3 | 23 | 3.0%
방향 | 23 | 3.0%
기둥 | 14 | 1.8%
b4 | 13 | 1.7%
Other values (151) | 353 | 45.8%

Most occurring characters

Value | Count | Frequency (%)
(space) | 536 | 15.2%
( | 318 | 9.0%
) | 317 | 9.0%
1 | 244 | 6.9%
B | 228 | 6.5%
 | 159 | 4.5%
 | 144 | 4.1%
 | 140 | 4.0%
 | 134 | 3.8%
2 | 132 | 3.7%
Other values (87) | 1171 | 33.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1284 | 36.4%
Decimal Number | 611 | 17.3%
Space Separator | 536 | 15.2%
Uppercase Letter | 336 | 9.5%
Open Punctuation | 318 | 9.0%
Close Punctuation | 317 | 9.0%
Dash Punctuation | 80 | 2.3%
Math Symbol | 37 | 1.1%
Other Punctuation | 4 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 159 | 12.4%
 | 144 | 11.2%
 | 140 | 10.9%
 | 134 | 10.4%
 | 132 | 10.3%
 | 86 | 6.7%
 | 48 | 3.7%
 | 47 | 3.7%
 | 43 | 3.3%
 | 30 | 2.3%
Other values (66) | 321 | 25.0%

Decimal Number
Value | Count | Frequency (%)
1 | 244 | 39.9%
2 | 132 | 21.6%
3 | 77 | 12.6%
4 | 66 | 10.8%
5 | 29 | 4.7%
6 | 28 | 4.6%
7 | 17 | 2.8%
8 | 8 | 1.3%
9 | 6 | 1.0%
0 | 4 | 0.7%

Uppercase Letter
Value | Count | Frequency (%)
B | 228 | 67.9%
F | 90 | 26.8%
X | 14 | 4.2%
E | 2 | 0.6%
H | 2 | 0.6%

Space Separator
Value | Count | Frequency (%)
(space) | 536 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 318 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 317 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 80 | 100.0%

Math Symbol
Value | Count | Frequency (%)
> | 37 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 4 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1903 | 54.0%
Hangul | 1284 | 36.4%
Latin | 336 | 9.5%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 159 | 12.4%
 | 144 | 11.2%
 | 140 | 10.9%
 | 134 | 10.4%
 | 132 | 10.3%
 | 86 | 6.7%
 | 48 | 3.7%
 | 47 | 3.7%
 | 43 | 3.3%
 | 30 | 2.3%
Other values (66) | 321 | 25.0%

Common
Value | Count | Frequency (%)
(space) | 536 | 28.2%
( | 318 | 16.7%
) | 317 | 16.7%
1 | 244 | 12.8%
2 | 132 | 6.9%
- | 80 | 4.2%
3 | 77 | 4.0%
4 | 66 | 3.5%
> | 37 | 1.9%
5 | 29 | 1.5%
Other values (6) | 67 | 3.5%

Latin
Value | Count | Frequency (%)
B | 228 | 67.9%
F | 90 | 26.8%
X | 14 | 4.2%
E | 2 | 0.6%
H | 2 | 0.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2239 | 63.6%
Hangul | 1284 | 36.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 536 | 23.9%
( | 318 | 14.2%
) | 317 | 14.2%
1 | 244 | 10.9%
B | 228 | 10.2%
2 | 132 | 5.9%
F | 90 | 4.0%
- | 80 | 3.6%
3 | 77 | 3.4%
4 | 66 | 2.9%
Other values (11) | 151 | 6.7%

Hangul
Value | Count | Frequency (%)
 | 159 | 12.4%
 | 144 | 11.2%
 | 140 | 10.9%
 | 134 | 10.4%
 | 132 | 10.3%
 | 86 | 6.7%
 | 48 | 3.7%
 | 47 | 3.7%
 | 43 | 3.3%
 | 30 | 2.3%
Other values (66) | 321 | 25.0%

시작층
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

지하1: 103
지하2: 63
지상1: 61
지하3: 27
지하4: 14
Other values (2): 11

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지상3
2nd row: 지상2
3rd row: 지상3
4th row: 지상2
5th row: 지상1

Common Values

Value | Count | Frequency (%)
지하1 | 103 | 36.9%
지하2 | 63 | 22.6%
지상1 | 61 | 21.9%
지하3 | 27 | 9.7%
지하4 | 14 | 5.0%
지상2 | 7 | 2.5%
지상3 | 4 | 1.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

종료층
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

지하1: 88
지상1: 73
지하2: 66
지하3: 30
지하4: 10
Other values (2): 12

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지상2
2nd row: 지상3
3rd row: 지상2
4th row: 지상3
5th row: 지하1

Common Values

Value | Count | Frequency (%)
지하1 | 88 | 31.5%
지상1 | 73 | 26.2%
지하2 | 66 | 23.7%
지하3 | 30 | 10.8%
지하4 | 10 | 3.6%
지상2 | 8 | 2.9%
지상3 | 4 | 1.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Correlations

 | 철도운영기관 | 역명 | 상하행구분 | 출입구번호 | 시작층 | 종료층
철도운영기관 | 1.000 | 1.000 | 0.000 | 0.000 | 0.375 | 0.346
역명 | 1.000 | 1.000 | 0.000 | 0.853 | 0.774 | 0.748
상하행구분 | 0.000 | 0.000 | 1.000 | 0.000 | 0.525 | 0.501
출입구번호 | 0.000 | 0.853 | 0.000 | 1.000 | 0.000 | 0.000
시작층 | 0.375 | 0.774 | 0.525 | 0.000 | 1.000 | 0.938
종료층 | 0.346 | 0.748 | 0.501 | 0.000 | 0.938 | 1.000

 | 종료층 | 출입구번호 | 상하행구분 | 시작층 | 역명 | 철도운영기관
종료층 | 1.000 | 0.000 | 0.533 | 0.624 | 0.396 | 0.367
출입구번호 | 0.000 | 1.000 | 0.000 | 0.000 | 0.478 | 0.000
상하행구분 | 0.533 | 0.000 | 1.000 | 0.559 | 0.000 | 0.000
시작층 | 0.624 | 0.000 | 0.559 | 1.000 | 0.424 | 0.398
역명 | 0.396 | 0.478 | 0.000 | 0.424 | 1.000 | 0.933
철도운영기관 | 0.367 | 0.000 | 0.000 | 0.398 | 0.933 | 1.000

 | 철도운영기관 | 역명 | 상하행구분 | 출입구번호 | 시작층 | 종료층
철도운영기관 | 1.000 | 0.933 | 0.000 | 0.000 | 0.398 | 0.367
역명 | 0.933 | 1.000 | 0.000 | 0.478 | 0.424 | 0.396
상하행구분 | 0.000 | 0.000 | 1.000 | 0.000 | 0.559 | 0.533
출입구번호 | 0.000 | 0.478 | 0.000 | 1.000 | 0.000 | 0.000
시작층 | 0.398 | 0.424 | 0.559 | 0.000 | 1.000 | 0.624
종료층 | 0.367 | 0.396 | 0.533 | 0.000 | 0.624 | 1.000
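
The report does not name the association measure behind these matrices. As a rough illustration of how a score between 0 and 1 can be computed for two categorical columns, the sketch below implements the plain (uncorrected) Cramér's V from a chi-squared statistic. This choice is an assumption: it is not claimed to reproduce the values above, and the function name `cramers_v` and the toy columns are hypothetical.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length sequences of categories."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    joint = Counter(zip(xs, ys))
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k == 0:
        # A constant column carries no association information.
        return 0.0
    chi2 = 0.0
    for x, rx in row_totals.items():
        for y, cy in col_totals.items():
            expected = rx * cy / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))


# Toy columns: each station maps to exactly one operator, so the score is 1.
station = ["s1", "s1", "s2", "s2"]
operator = ["op_a", "op_a", "op_b", "op_b"]
print(cramers_v(station, operator))

# Direction is spread evenly across stations here, so the score is 0.
direction = ["up", "down", "up", "down"]
print(cramers_v(station, direction))
```

A score of 1 arises whenever one column fully determines the other. That would be consistent with the 1.000 between 역명 and 철도운영기관 in the first matrix if each station belongs to a single operator.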

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

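First rows

 | 철도운영기관 | 선명 | 역명 | 상하행구분 | 출입구번호 | 상세위치 | 시작층 | 종료층
0 | 코레일 | 3호선 | 대곡 | 하행 | <NA> | (3F)화정역 방향 승강장 7-4 출입문 앞 | 지상3 | 지상2
1 | 코레일 | 3호선 | 대곡 | 상행 | <NA> | (2F)맞이방 안내부스 앞 | 지상2 | 지상3
2 | 코레일 | 3호선 | 대곡 | 하행 | <NA> | (3F)백석역 방향 승강장 3-4 출입문 앞 | 지상3 | 지상2
3 | 코레일 | 3호선 | 대곡 | 상행 | <NA> | (2F)맞이방 편의점 앞 | 지상2 | 지상3
4 | 서울교통공사 | 3호선 | 일원 | 하행 | 3 | (F1)3번 출입구 방면 | 지상1 | 지하1
5 | 서울교통공사 | 3호선 | 일원 | 상행 | 6 | (F1)6번 출입구 방면 | 지상1 | 지하1
6 | 서울교통공사 | 3호선 | 일원 | 하행 | 6 | (B1)6번 출입구 방면 | 지하1 | 지상1
7 | 서울교통공사 | 3호선 | 수서 | 하행 | 1 | (F1)1번 출입구 방면 | 지상1 | 지하1
8 | 서울교통공사 | 3호선 | 수서 | 상행 | 1 | (B1)1번 출입구 방면 | 지하1 | 지상1
9 | 서울교통공사 | 3호선 | 수서 | 하행 | 2 | (F1)2번 출입구 방면 | 지상1 | 지하1

Last rows

 | 철도운영기관 | 선명 | 역명 | 상하행구분 | 출입구번호 | 상세위치 | 시작층 | 종료층
269 | 서울교통공사 | 3호선 | 대치 | 상행 | 4 | (B1)4번 출입구 방면 | 지하1 | 지상1
270 | 서울교통공사 | 3호선 | 학여울 | 하행 | <NA> | (B2)B3(대) | 지하2 | 지하3
271 | 서울교통공사 | 3호선 | 학여울 | 상행 | <NA> | (B3)B3(대) | 지하3 | 지하2
272 | 서울교통공사 | 3호선 | 대청 | 하행 | 6 | (F1)6번 출입구 방면 | 지상1 | 지하1
273 | 서울교통공사 | 3호선 | 대청 | 상행 | 6 | (B1)6번 출입구 방면 | 지하1 | 지상1
274 | 서울교통공사 | 3호선 | 대청 | 하행 | 3 | (F1)3번 출입구 방면 | 지상1 | 지하1
275 | 서울교통공사 | 3호선 | 대청 | 상행 | 3 | (B1)3번 출입구 방면 | 지하1 | 지상1
276 | 서울교통공사 | 3호선 | 일원 | 하행 | 1 | (F1)1번 출입구 방면 | 지상1 | 지하1
277 | 서울교통공사 | 3호선 | 일원 | 상행 | 1 | (B1)1번 출입구 방면 | 지하1 | 지상1
278 | 서울교통공사 | 3호선 | 일원 | 상행 | 2 | (B1)2번 출입구 방면 | 지하1 | 지상1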
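
In the sample rows, the parenthesised prefix of 상세위치 agrees with 시작층, but it is written in more than one style: `(3F)` and `(F1)` both denote above-ground floors, while `(B1)` denotes a basement level. The sketch below shows one way to normalise that prefix into the 지상N / 지하N labels used by the floor columns. The regular expression covers only the three styles visible in this report, and the function name `floor_label` is hypothetical.

```python
import re

# Matches "(B2)", "(F1)" or "(3F)" at the start of the text, ignoring leading spaces.
_PREFIX = re.compile(r"^\s*\((?:(B)(\d+)|F(\d+)|(\d+)F)\)")


def floor_label(detail):
    """Return the floor label implied by the prefix, or None if there is none."""
    match = _PREFIX.match(detail)
    if match is None:
        return None
    basement, b_num, f_num, num_f = match.groups()
    if basement:
        return f"지하{b_num}"
    return f"지상{f_num or num_f}"


# Values taken from the sample rows of the report.
for text in [
    "(3F)화정역 방향 승강장 7-4 출입문 앞",
    "(F1)3번 출입구 방면",
    "(B2)B3(대)",
]:
    print(text, "->", floor_label(text))
```

Comparing the derived label with 시작층 row by row would be one way to flag records where the free-text location and the floor columns disagree.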

Duplicate rows

Most frequently occurring

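 | 철도운영기관 | 선명 | 역명 | 상하행구분 | 출입구번호 | 상세위치 | 시작층 | 종료층 | # duplicates
9 | 서울교통공사 | 3호선 | 고속터미널 | 하행 | <NA> | (B2) | 지하2 | 지하3 | 8
4 | 서울교통공사 | 3호선 | 경찰병원 | 하행 | <NA> | (B1) | 지하1 | 지하2 | 6
3 | 서울교통공사 | 3호선 | 가락시장 | 하행 | <NA> | (B2) | 지하2 | 지하4 | 4
12 | 서울교통공사 | 3호선 | 도곡 | 하행 | <NA> | (B2) | 지하2 | 지하4 | 4
20 | 서울교통공사 | 3호선 | 오금 | 하행 | <NA> | (B2) | 지하2 | 지하3 | 3
0 | 서울교통공사 | 3호선 | 가락시장 | 상행 | <NA> | (B2)B2(대) | 지하2 | 지하1 | 2
1 | 서울교통공사 | 3호선 | 가락시장 | 하행 | <NA> | (B1)B2(대) | 지하1 | 지하2 | 2
2 | 서울교통공사 | 3호선 | 가락시장 | 하행 | <NA> | (B2) | 지하2 | 지하3 | 2
5 | 서울교통공사 | 3호선 | 고속터미널 | 상행 | <NA> | (B3) 2-3 | 지하3 | 지하2 | 2
6 | 서울교통공사 | 3호선 | 고속터미널 | 상행 | <NA> | (B3) 4-3 | 지하3 | 지하2 | 2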