Overview

Dataset statistics

Number of variables: 8
Number of observations: 113
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): 0.9%
Total size in memory: 7.3 KiB
Average record size in memory: 66.2 B
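The derived figures in the dataset statistics follow from the raw counts. As a quick sanity check, the sketch below recomputes the duplicate-row percentage and the average record size using only the numbers printed in this report, not the underlying data. The variable names are illustrative, and the 7.3 KiB total is itself a rounded value.

```python
# Figures copied from the Dataset statistics section of this report.
n_observations = 113
n_duplicate_rows = 1
total_size_kib = 7.3  # rounded value as printed in the report

# Share of duplicate rows, in percent.
duplicate_pct = 100 * n_duplicate_rows / n_observations

# Average record size in bytes (1 KiB = 1024 bytes).
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both results agree with the reported 0.9% and 66.2 B after rounding to one decimal place.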

Variable types

Categorical: 6
Text: 2

Dataset

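Description: Restroom data for the urban and metropolitan railway stations on Seoul Metropolitan Area Line 1, covering the railway operator name (철도운영기관명), line name (선명), station name (역명), above-ground/underground classification (지상지하구분), restroom floor (역층), inside/outside the fare gate (게이트내외), exit number (출구번호), and detailed location (상세위치).
Author: 국가철도공단 (Korea National Railway)
URL: https://www.data.go.kr/data/15041254/fileData.do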

Alerts

선명 has constant value "1호선" (Constant)
Dataset has 1 (0.9%) duplicate rows (Duplicates)
철도운영기관명 is highly overall correlated with 지상지하구분 (High correlation)
지상지하구분 is highly overall correlated with 철도운영기관명 (High correlation)

Reproduction

Analysis started: 2023-12-12 11:41:46.093810
Analysis finished: 2023-12-12 11:41:47.094485
Duration: 1 second
Software version: ydata-profiling v4.5.1
Download configuration: config.json
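A report like this one can be regenerated with ydata-profiling. The sketch below is a minimal, hedged example and not the exact configuration used for this run: the function name `build_report`, its parameters, and the `cp949` default encoding are assumptions for illustration, since the report does not state the input file name or its encoding. `ProfileReport` and `to_file` are the library's own calls.

```python
def build_report(csv_path: str, html_path: str, encoding: str = "cp949") -> None:
    """Profile a CSV file and write the result as an HTML report.

    csv_path, html_path and the cp949 default are hypothetical; adjust them
    to the file actually downloaded from the dataset URL.
    """
    # Imported lazily so that defining the function does not require
    # pandas or ydata-profiling to be installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(df, title="Profiling Report")
    profile.to_file(html_path)


# Example call with hypothetical file names:
# build_report("line1_restrooms.csv", "line1_restrooms_report.html")
```

The timestamps and memory figures of a fresh run will differ from those shown here, and results may vary with the installed library version.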

Variables

철도운영기관명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

| Value | Count |
| 코레일 | 99 |
| 서울교통공사 | 14 |

Length

Max length: 6
Median length: 3
Mean length: 3.3716814
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울교통공사
2nd row: 서울교통공사
3rd row: 서울교통공사
4th row: 서울교통공사
5th row: 서울교통공사

Common Values

| Value | Count | Frequency (%) |
| 코레일 | 99 | 87.6% |
| 서울교통공사 | 14 | 12.4% |
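The frequency and length statistics for 철도운영기관명 can be derived directly from the two value counts. The following sketch recomputes them from the counts printed in this report; it assumes that length means the number of Unicode code points, which is what Python's `len` returns for these strings.

```python
from statistics import mean, median

# Value counts taken from the Common Values table for 철도운영기관명.
counts = {"코레일": 99, "서울교통공사": 14}
total = sum(counts.values())  # 113 observations

# Relative frequency of each category, in percent.
frequencies = {value: 100 * n / total for value, n in counts.items()}

# One string length per observation.
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

for value, pct in frequencies.items():
    print(f"{value}: {pct:.1f}%")
print("max", max(lengths), "median", median(lengths),
      "mean", round(mean(lengths), 7), "min", min(lengths))
```

The mean works out to (99 × 3 + 14 × 6) / 113 = 381 / 113, which rounds to the reported 3.3716814.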

Length

Histogram of lengths of the category

Common Values (Plot)

| Value | Count | Frequency (%) |
| 코레일 | 99 | 87.6% |
| 서울교통공사 | 14 | 12.4% |

선명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

| Value | Count |
| 1호선 | 113 |

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1호선
2nd row: 1호선
3rd row: 1호선
4th row: 1호선
5th row: 1호선

Common Values

| Value | Count | Frequency (%) |
| 1호선 | 113 | 100.0% |

Length

Histogram of lengths of the category

Common Values (Plot)

| Value | Count | Frequency (%) |
| 1호선 | 113 | 100.0% |

역명
Text

Distinct: 87
Distinct (%): 77.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Length

Max length: 12
Median length: 2
Mean length: 2.5840708
Min length: 2

Characters and Unicode

Total characters: 292
Distinct characters: 113
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
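To illustrate how Unicode character properties can be read per code point, the sketch below counts general categories with Python's standard `unicodedata` module. It uses a handful of station names that appear in this report's sample rows, so the counts cover only those names, not the full column. The mapping from two-letter category codes to the names used in the report is written out by hand for the four categories that occur.

```python
import unicodedata
from collections import Counter

# Station names copied from sample rows shown in this report.
names = ["청량리(서울시립대입구)", "제기동", "신설동", "동묘앞", "종로5가"]

# Two-letter general category codes and the names the report uses for them.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}

# Count the general category of every character in every name.
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)

for code, n in category_counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

Hangul syllables fall under Other Letter, which is why that category dominates both here and in the full column. The standard library does not expose script or block properties, so those parts of the report are not reproduced by this sketch.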

Unique

Unique: 68
Unique (%): 60.2%

Sample

1st row: 청량리(서울시립대입구)
2nd row: 제기동
3rd row: 신설동
4th row: 동묘앞
5th row: 동묘앞

| Value | Count | Frequency (%) |
| 금정 | 4 | 3.5% |
| 동묘앞 | 4 | 3.5% |
| 용산 | 4 | 3.5% |
| 부평 | 3 | 2.7% |
| 관악 | 2 | 1.8% |
| 수원 | 2 | 1.8% |
| 도봉 | 2 | 1.8% |
| 망월사 | 2 | 1.8% |
| 의정부 | 2 | 1.8% |
| 세마 | 2 | 1.8% |
| Other values (77) | 86 | 76.1% |

Most occurring characters

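| Value | Count | Frequency (%) |
| | 13 | 4.5% |
| | 12 | 4.1% |
| | 10 | 3.4% |
| | 10 | 3.4% |
| | 8 | 2.7% |
| | 7 | 2.4% |
| | 6 | 2.1% |
| | 5 | 1.7% |
| | 5 | 1.7% |
| | 5 | 1.7% |
| Other values (103) | 211 | 72.3% |

Most occurring categories

| Value | Count | Frequency (%) |
| Other Letter | 283 | 96.9% |
| Open Punctuation | 3 | 1.0% |
| Close Punctuation | 3 | 1.0% |
| Decimal Number | 3 | 1.0% |

Most frequent character per category

Other Letter
| Value | Count | Frequency (%) |
| | 13 | 4.6% |
| | 12 | 4.2% |
| | 10 | 3.5% |
| | 10 | 3.5% |
| | 8 | 2.8% |
| | 7 | 2.5% |
| | 6 | 2.1% |
| | 5 | 1.8% |
| | 5 | 1.8% |
| | 5 | 1.8% |
| Other values (99) | 202 | 71.4% |

Decimal Number
| Value | Count | Frequency (%) |
| 3 | 2 | 66.7% |
| 5 | 1 | 33.3% |

Open Punctuation
| Value | Count | Frequency (%) |
| ( | 3 | 100.0% |

Close Punctuation
| Value | Count | Frequency (%) |
| ) | 3 | 100.0% |

Most occurring scripts

| Value | Count | Frequency (%) |
| Hangul | 283 | 96.9% |
| Common | 9 | 3.1% |

Most frequent character per script

Hangul
| Value | Count | Frequency (%) |
| | 13 | 4.6% |
| | 12 | 4.2% |
| | 10 | 3.5% |
| | 10 | 3.5% |
| | 8 | 2.8% |
| | 7 | 2.5% |
| | 6 | 2.1% |
| | 5 | 1.8% |
| | 5 | 1.8% |
| | 5 | 1.8% |
| Other values (99) | 202 | 71.4% |

Common
| Value | Count | Frequency (%) |
| ( | 3 | 33.3% |
| ) | 3 | 33.3% |
| 3 | 2 | 22.2% |
| 5 | 1 | 11.1% |

Most occurring blocks

| Value | Count | Frequency (%) |
| Hangul | 283 | 96.9% |
| ASCII | 9 | 3.1% |

Most frequent character per block

Hangul
| Value | Count | Frequency (%) |
| | 13 | 4.6% |
| | 12 | 4.2% |
| | 10 | 3.5% |
| | 10 | 3.5% |
| | 8 | 2.8% |
| | 7 | 2.5% |
| | 6 | 2.1% |
| | 5 | 1.8% |
| | 5 | 1.8% |
| | 5 | 1.8% |
| Other values (99) | 202 | 71.4% |

ASCII
| Value | Count | Frequency (%) |
| ( | 3 | 33.3% |
| ) | 3 | 33.3% |
| 3 | 2 | 22.2% |
| 5 | 1 | 11.1% |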

지상지하구분
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

| Value | Count |
| 지상 | 96 |
| 지하 | 17 |

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지하
2nd row: 지하
3rd row: 지하
4th row: 지하
5th row: 지하

Common Values

| Value | Count | Frequency (%) |
| 지상 | 96 | 85.0% |
| 지하 | 17 | 15.0% |

Length

Histogram of lengths of the category

Common Values (Plot)

| Value | Count | Frequency (%) |
| 지상 | 96 | 85.0% |
| 지하 | 17 | 15.0% |

역층
Categorical

Distinct: 5
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

| Value | Count |
| 1 | 48 |
| 2 | 45 |
| 3 | 18 |
| 4 | 1 |
| <NA> | 1 |

Length

Max length: 4
Median length: 1
Mean length: 1.0265487
Min length: 1

Unique

Unique: 2
Unique (%): 1.8%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

| Value | Count | Frequency (%) |
| 1 | 48 | 42.5% |
| 2 | 45 | 39.8% |
| 3 | 18 | 15.9% |
| 4 | 1 | 0.9% |
| <NA> | 1 | 0.9% |

Length

Histogram of lengths of the category

Common Values (Plot)

| Value | Count | Frequency (%) |
| 1 | 48 | 42.5% |
| 2 | 45 | 39.8% |
| 3 | 18 | 15.9% |
| 4 | 1 | 0.9% |
| na | 1 | 0.9% |

게이트내외
Categorical

Distinct: 2
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

| Value | Count |
| | 86 |
| | 27 |

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row:
2nd row:
3rd row:
4th row:
5th row:

Common Values

| Value | Count | Frequency (%) |
| | 86 | 76.1% |
| | 27 | 23.9% |

Length

Histogram of lengths of the category

Common Values (Plot)

| Value | Count | Frequency (%) |
| | 86 | 76.1% |
| | 27 | 23.9% |

출구번호
Categorical

Distinct: 20
Distinct (%): 17.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

| Value | Count |
| 1 | 54 |
| 2 | 16 |
| 3 | 9 |
| <NA> | 7 |
| 4 | 5 |
| Other values (15) | 22 |

Length

Max length: 13
Median length: 1
Mean length: 1.8584071
Min length: 1

Unique

Unique: 12
Unique (%): 10.6%

Sample

1st row: 4
2nd row: 2
3rd row: 1
4th row: 3
5th row: 2

Common Values

| Value | Count | Frequency (%) |
| 1 | 54 | 47.8% |
| 2 | 16 | 14.2% |
| 3 | 9 | 8.0% |
| <NA> | 7 | 6.2% |
| 4 | 5 | 4.4% |
| 2 /3 | 4 | 3.5% |
| 1 /2 | 4 | 3.5% |
| 7 | 2 | 1.8% |
| 8 | 1 | 0.9% |
| 1 /2 /3 /4 /5 | 1 | 0.9% |
| Other values (10) | 10 | 8.8% |

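Length

Histogram of lengths of the category

Counts per individual exit number (the frequencies are consistent with a total of 138 tokens, more than the 113 observations, which suggests that multi-exit values such as "1 /2" are split into separate tokens):

| Value | Count | Frequency (%) |
| 1 | 64 | 46.4% |
| 2 | 27 | 19.6% |
| 3 | 16 | 11.6% |
| 4 | 9 | 6.5% |
| na | 7 | 5.1% |
| 7 | 4 | 2.9% |
| 5 | 4 | 2.9% |
| 8 | 3 | 2.2% |
| 6 | 3 | 2.2% |
| 11 | 1 | 0.7% |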

상세위치
Text

Distinct: 111
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Length

Max length: 39
Median length: 23
Mean length: 15.628319
Min length: 6

Characters and Unicode

Total characters: 1766
Distinct characters: 148
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 109
Unique (%): 96.5%

Sample

1st row: 4번출구 아래
2nd row: 2번출구우측
3rd row: 대합실(지하1층) 역무실 앞
4th row: 상선승강장 4-3지점
5th row: 하선승강장 7-3지점

| Value | Count | Frequency (%) |
| 방향 | 35 | 7.4% |
| | 33 | 7.0% |
| 출입구 | 29 | 6.1% |
| 맞이방 | 24 | 5.1% |
| 2f | 17 | 3.6% |
| 1번 | 15 | 3.2% |
| 게이트 | 12 | 2.5% |
| | 11 | 2.3% |
| 1f | 9 | 1.9% |
| 1층 | 9 | 1.9% |
| Other values (148) | 278 | 58.9% |

Most occurring characters

| Value | Count | Frequency (%) |
| (space) | 393 | 22.3% |
| 1 | 68 | 3.9% |
| ( | 67 | 3.8% |
| ) | 67 | 3.8% |
| | 67 | 3.8% |
| | 65 | 3.7% |
| | 58 | 3.3% |
| | 57 | 3.2% |
| | 53 | 3.0% |
| 2 | 50 | 2.8% |
| Other values (138) | 821 | 46.5% |

Most occurring categories

| Value | Count | Frequency (%) |
| Other Letter | 1007 | 57.0% |
| Space Separator | 393 | 22.3% |
| Decimal Number | 159 | 9.0% |
| Open Punctuation | 68 | 3.9% |
| Close Punctuation | 68 | 3.9% |
| Uppercase Letter | 51 | 2.9% |
| Other Punctuation | 9 | 0.5% |
| Dash Punctuation | 6 | 0.3% |
| Lowercase Letter | 4 | 0.2% |
| Math Symbol | 1 | 0.1% |

Most frequent character per category

Other Letter
| Value | Count | Frequency (%) |
| | 67 | 6.7% |
| | 65 | 6.5% |
| | 58 | 5.8% |
| | 57 | 5.7% |
| | 53 | 5.3% |
| | 39 | 3.9% |
| | 37 | 3.7% |
| | 34 | 3.4% |
| | 32 | 3.2% |
| | 32 | 3.2% |
| Other values (112) | 533 | 52.9% |

Decimal Number
| Value | Count | Frequency (%) |
| 1 | 68 | 42.8% |
| 2 | 50 | 31.4% |
| 3 | 28 | 17.6% |
| 4 | 7 | 4.4% |
| 8 | 2 | 1.3% |
| 5 | 2 | 1.3% |
| 7 | 2 | 1.3% |

Uppercase Letter
| Value | Count | Frequency (%) |
| F | 45 | 88.2% |
| B | 2 | 3.9% |
| L | 1 | 2.0% |
| E | 1 | 2.0% |
| C | 1 | 2.0% |
| U | 1 | 2.0% |

Lowercase Letter
| Value | Count | Frequency (%) |
| e | 1 | 25.0% |
| v | 1 | 25.0% |
| o | 1 | 25.0% |
| r | 1 | 25.0% |

Open Punctuation
| Value | Count | Frequency (%) |
| ( | 67 | 98.5% |
| [ | 1 | 1.5% |

Close Punctuation
| Value | Count | Frequency (%) |
| ) | 67 | 98.5% |
| ] | 1 | 1.5% |

Other Punctuation
| Value | Count | Frequency (%) |
| / | 8 | 88.9% |
| . | 1 | 11.1% |

Space Separator
| Value | Count | Frequency (%) |
| (space) | 393 | 100.0% |

Dash Punctuation
| Value | Count | Frequency (%) |
| - | 6 | 100.0% |

Math Symbol
| Value | Count | Frequency (%) |
| | 1 | 100.0% |

Most occurring scripts

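| Value | Count | Frequency (%) |
| Hangul | 1007 | 57.0% |
| Common | 704 | 39.9% |
| Latin | 55 | 3.1% |

Most frequent character per script

Hangul
| Value | Count | Frequency (%) |
| | 67 | 6.7% |
| | 65 | 6.5% |
| | 58 | 5.8% |
| | 57 | 5.7% |
| | 53 | 5.3% |
| | 39 | 3.9% |
| | 37 | 3.7% |
| | 34 | 3.4% |
| | 32 | 3.2% |
| | 32 | 3.2% |
| Other values (112) | 533 | 52.9% |

Common
| Value | Count | Frequency (%) |
| (space) | 393 | 55.8% |
| 1 | 68 | 9.7% |
| ( | 67 | 9.5% |
| ) | 67 | 9.5% |
| 2 | 50 | 7.1% |
| 3 | 28 | 4.0% |
| / | 8 | 1.1% |
| 4 | 7 | 1.0% |
| - | 6 | 0.9% |
| 8 | 2 | 0.3% |
| Other values (6) | 8 | 1.1% |

Latin
| Value | Count | Frequency (%) |
| F | 45 | 81.8% |
| B | 2 | 3.6% |
| e | 1 | 1.8% |
| v | 1 | 1.8% |
| L | 1 | 1.8% |
| E | 1 | 1.8% |
| o | 1 | 1.8% |
| C | 1 | 1.8% |
| U | 1 | 1.8% |
| r | 1 | 1.8% |

Most occurring blocks

| Value | Count | Frequency (%) |
| Hangul | 1007 | 57.0% |
| ASCII | 758 | 42.9% |
| Arrows | 1 | 0.1% |

Most frequent character per block

ASCII
| Value | Count | Frequency (%) |
| (space) | 393 | 51.8% |
| 1 | 68 | 9.0% |
| ( | 67 | 8.8% |
| ) | 67 | 8.8% |
| 2 | 50 | 6.6% |
| F | 45 | 5.9% |
| 3 | 28 | 3.7% |
| / | 8 | 1.1% |
| 4 | 7 | 0.9% |
| - | 6 | 0.8% |
| Other values (15) | 19 | 2.5% |

Hangul
| Value | Count | Frequency (%) |
| | 67 | 6.7% |
| | 65 | 6.5% |
| | 58 | 5.8% |
| | 57 | 5.7% |
| | 53 | 5.3% |
| | 39 | 3.9% |
| | 37 | 3.7% |
| | 34 | 3.4% |
| | 32 | 3.2% |
| | 32 | 3.2% |
| Other values (112) | 533 | 52.9% |

Arrows
| Value | Count | Frequency (%) |
| | 1 | 100.0% |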

Correlations

| | 철도운영기관명 | 역명 | 지상지하구분 | 역층 | 게이트내외 | 출구번호 |
| 철도운영기관명 | 1.000 | 1.000 | 0.940 | 0.595 | 0.000 | 0.507 |
| 역명 | 1.000 | 1.000 | 0.787 | 0.000 | 0.000 | 0.000 |
| 지상지하구분 | 0.940 | 0.787 | 1.000 | 0.572 | 0.000 | 0.552 |
| 역층 | 0.595 | 0.000 | 0.572 | 1.000 | 0.189 | 0.000 |
| 게이트내외 | 0.000 | 0.000 | 0.000 | 0.189 | 1.000 | 0.000 |
| 출구번호 | 0.507 | 0.000 | 0.552 | 0.000 | 0.000 | 1.000 |

| | 역층 | 출구번호 | 지상지하구분 | 철도운영기관명 | 게이트내외 |
| 역층 | 1.000 | 0.000 | 0.389 | 0.406 | 0.123 |
| 출구번호 | 0.000 | 1.000 | 0.450 | 0.412 | 0.000 |
| 지상지하구분 | 0.389 | 0.450 | 1.000 | 0.779 | 0.000 |
| 철도운영기관명 | 0.406 | 0.412 | 0.779 | 1.000 | 0.000 |
| 게이트내외 | 0.123 | 0.000 | 0.000 | 0.000 | 1.000 |

| | 철도운영기관명 | 지상지하구분 | 역층 | 게이트내외 | 출구번호 |
| 철도운영기관명 | 1.000 | 0.779 | 0.406 | 0.000 | 0.412 |
| 지상지하구분 | 0.779 | 1.000 | 0.389 | 0.000 | 0.450 |
| 역층 | 0.406 | 0.389 | 1.000 | 0.123 | 0.000 |
| 게이트내외 | 0.000 | 0.000 | 0.123 | 1.000 | 0.000 |
| 출구번호 | 0.412 | 0.450 | 0.000 | 0.000 | 1.000 |

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

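| | 철도운영기관명 | 선명 | 역명 | 지상지하구분 | 역층 | 게이트내외 | 출구번호 | 상세위치 |
| 0 | 서울교통공사 | 1호선 | 청량리(서울시립대입구) | 지하 | 1 | | 4 | 4번출구 아래 |
| 1 | 서울교통공사 | 1호선 | 제기동 | 지하 | 1 | | 2 | 2번출구우측 |
| 2 | 서울교통공사 | 1호선 | 신설동 | 지하 | 1 | | 1 | 대합실(지하1층) 역무실 앞 |
| 3 | 서울교통공사 | 1호선 | 동묘앞 | 지하 | 1 | | 3 | 상선승강장 4-3지점 |
| 4 | 서울교통공사 | 1호선 | 동묘앞 | 지하 | 1 | | 2 | 하선승강장 7-3지점 |
| 5 | 서울교통공사 | 1호선 | 동묘앞 | 지하 | 1 | | 3 | 지하1층 3번출입구 부근 |
| 6 | 서울교통공사 | 1호선 | 동묘앞 | 지상 | 1 | | 2 | 1층 역무실 부근 |
| 7 | 서울교통공사 | 1호선 | 동대문 | 지하 | 1 | | 4 | 지하1층 4번출입구 부근 |
| 8 | 서울교통공사 | 1호선 | 종로5가 | 지하 | 1 | | 8 | 지하1층 지하상가 연결통로앞 |
| 9 | 서울교통공사 | 1호선 | 종로3가 | 지하 | 1 | | 11 | 지하1층 라게이트 부근 (역무실앞) [게이트 내] |

| | 철도운영기관명 | 선명 | 역명 | 지상지하구분 | 역층 | 게이트내외 | 출구번호 | 상세위치 |
| 103 | 코레일 | 1호선 | 구로 | 지상 | 3 | | 1 | (3F)1번 출구 게이트 옆 |
| 104 | 코레일 | 1호선 | 구로 | 지상 | 3 | | 1 | (3F)1번 출구 게이트 옆 |
| 105 | 코레일 | 1호선 | 광운대 | 지상 | 1 | | 1 | (1층) 1번 출입구 방향 옆 |
| 106 | 코레일 | 1호선 | 관악 | 지상 | 2 | | 2 | (2F) 역사 대합실 |
| 107 | 코레일 | 1호선 | 관악 | 지상 | 1 | | <NA> | (1F) 하선 승강장 연결 통로 |
| 108 | 코레일 | 1호선 | 개봉 | 지상 | 2 | | 1 | (2F)맞이방 역무실 맞은편 |
| 109 | 코레일 | 1호선 | 간석 | 지상 | 2 | | <NA> | 개찰구 안쪽 정면 |
| 110 | 코레일 | 1호선 | 가산디지털단지 | 지상 | 2 | | 1 /7 /8 | (2F) 게이트 옆 |
| 111 | 코레일 | 1호선 | 가능 | 지상 | 1 | | 1 | 1번출구 앞 |
| 112 | 코레일 | 1호선 | 덕정 | 지상 | 1 | | 1 | 맞이방 표 내는 곳 옆 |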

Duplicate rows

Most frequently occurring

| | 철도운영기관명 | 선명 | 역명 | 지상지하구분 | 역층 | 게이트내외 | 출구번호 | 상세위치 | # duplicates |
| 0 | 코레일 | 1호선 | 구로 | 지상 | 3 | | 1 | (3F)1번 출구 게이트 옆 | 2 |
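Duplicate rows are rows whose values match in every column. The sketch below shows one way to find them with the standard library, using rows 103 to 106 as read from the Sample section. It is an illustration under two assumptions: the column split of those rows is inferred from the flattened text, and the 게이트내외 column is left out because its values are not legible in this report.

```python
from collections import Counter

# Rows 103-106 from the Sample section, without the 게이트내외 column.
# Column order: 철도운영기관명, 선명, 역명, 지상지하구분, 역층, 출구번호, 상세위치
rows = [
    ("코레일", "1호선", "구로", "지상", "3", "1", "(3F)1번 출구 게이트 옆"),
    ("코레일", "1호선", "구로", "지상", "3", "1", "(3F)1번 출구 게이트 옆"),
    ("코레일", "1호선", "광운대", "지상", "1", "1", "(1층) 1번 출입구 방향 옆"),
    ("코레일", "1호선", "관악", "지상", "2", "2", "(2F) 역사 대합실"),
]

# Count how often each complete row occurs; tuples are hashable.
row_counts = Counter(rows)

# Keep only the rows that occur more than once.
duplicates = {row: n for row, n in row_counts.items() if n > 1}

for row, n in duplicates.items():
    print(n, row)
```

On these four rows the only repeated entry is the 구로 row, which occurs twice, matching the single duplicate listed in the table.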