Overview

Dataset statistics

Number of variables: 8
Number of observations: 174
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 12
Duplicate rows (%): 6.9%
Total size in memory: 11.0 KiB
Average record size in memory: 64.8 B

Variable types

Categorical: 7
Text: 1

Dataset

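Description: Data for the urban and metropolitan railway stations on Busan Line 3 (부산3호선): railway operator name, line name, station name, up/down direction, exit number, detailed location, start floor, and end floor.
Author: 국가철도공단 (Korea National Railway)
URL: https://www.data.go.kr/data/15041355/fileData.do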

Alerts

철도운영기관 has constant value "부산교통공사" (Constant)
선명 has constant value "3호선" (Constant)
Dataset has 12 (6.9%) duplicate rows (Duplicates)
시작층 is highly overall correlated with 종료층 (High correlation)
종료층 is highly overall correlated with 시작층 (High correlation)
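
The Constant alerts flag columns that hold a single distinct value. A minimal sketch of that check, using only the standard library: the helper name `constant_columns` is hypothetical, and the three rows are reduced excerpts (four of the eight columns) of rows 0, 1 and 164 from the Sample section of this report.

```python
def constant_columns(rows):
    """Return {column: value} for columns holding exactly one distinct value."""
    if not rows:
        return {}
    constants = {}
    for column in rows[0]:
        distinct = {row[column] for row in rows}
        if len(distinct) == 1:
            constants[column] = distinct.pop()
    return constants


# Excerpt of rows 0, 1 and 164 from the Sample section (four columns only).
rows = [
    {"철도운영기관": "부산교통공사", "선명": "3호선", "역명": "강서구청", "상하행구분": "하행"},
    {"철도운영기관": "부산교통공사", "선명": "3호선", "역명": "강서구청", "상하행구분": "상행"},
    {"철도운영기관": "부산교통공사", "선명": "3호선", "역명": "체육공원", "상하행구분": "하행"},
]

print(constant_columns(rows))
```

Constant columns carry no information for modelling or correlation, which is presumably why the correlation matrices in this report list only five of the eight variables.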

Reproduction

Analysis started: 2023-12-12 09:08:32.893677
Analysis finished: 2023-12-12 09:08:33.586755
Duration: 0.69 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
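
A report of this kind can in principle be regenerated with ydata-profiling. The sketch below is not the configuration used for this report (that lives in config.json, which is not reproduced here). The function name `build_report`, the file paths, the report title, and the `cp949` encoding are all assumptions; the CSV itself would have to be downloaded from the URL given under Dataset.

```python
def build_report(csv_path, html_path, encoding="cp949"):
    """Profile a CSV file with ydata-profiling and write an HTML report."""
    # Imported lazily so that defining the function needs no third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it if the file decodes incorrectly.
    df = pd.read_csv(csv_path, encoding=encoding)

    # Default settings; the original report may have used a different config.
    report = ProfileReport(df, title="Busan Line 3 profile")
    report.to_file(html_path)
    return report
```

A call would look like `build_report("busan_line3.csv", "report.html")`, where both file names are placeholders.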

Variables

철도운영기관 (railway operator)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

부산교통공사: 174

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 부산교통공사
2nd row: 부산교통공사
3rd row: 부산교통공사
4th row: 부산교통공사
5th row: 부산교통공사

Common Values

Value | Count | Frequency (%)
부산교통공사 | 174 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

선명 (line name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

3호선: 174

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3호선
2nd row: 3호선
3rd row: 3호선
4th row: 3호선
5th row: 3호선

Common Values

Value | Count | Frequency (%)
3호선 | 174 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

역명 (station name)
Categorical

Distinct: 17
Distinct (%): 9.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

연산: 20
만덕: 20
배산: 16
종합운동장: 13
강서구청: 12
Other values (12): 93

Length

Max length: 12
Median length: 10
Mean length: 4.2241379
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강서구청
2nd row: 강서구청
3rd row: 강서구청
4th row: 강서구청
5th row: 강서구청

Common Values

Value | Count | Frequency (%)
연산 | 20 | 11.5%
만덕 | 20 | 11.5%
배산 | 16 | 9.2%
종합운동장 | 13 | 7.5%
강서구청 | 12 | 6.9%
망미(병무청) | 12 | 6.9%
숙등(부민병원) | 12 | 6.9%
물만골 | 12 | 6.9%
체육공원 | 10 | 5.7%
덕천(부산과기대) | 10 | 5.7%
Other values (7) | 37 | 21.3%

Length

[Figure: Histogram of lengths of the category]

Words

Value | Count | Frequency (%)
연산 | 20 | 11.5%
만덕 | 20 | 11.5%
배산 | 16 | 9.2%
종합운동장 | 13 | 7.5%
강서구청 | 12 | 6.9%
망미(병무청 | 12 | 6.9%
숙등(부민병원 | 12 | 6.9%
물만골 | 12 | 6.9%
덕천(부산과기대 | 10 | 5.7%
체육공원 | 10 | 5.7%
Other values (7) | 37 | 21.3%

상하행구분 (up/down direction: 상행 = up, 하행 = down)
Categorical

Distinct: 2
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

상행: 98
하행: 76

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 하행
2nd row: 상행
3rd row: 상행
4th row: 하행
5th row: 하행

Common Values

Value | Count | Frequency (%)
상행 | 98 | 56.3%
하행 | 76 | 43.7%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

출입구번호 (exit number)
Categorical

Distinct: 18
Distinct (%): 10.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

<NA>: 97
2: 16
1: 12
3: 9
4: 9
Other values (13): 31

Length

Max length: 7
Median length: 4
Mean length: 2.9367816
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 2
4th row: 2
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 97 | 55.7%
2 | 16 | 9.2%
1 | 12 | 6.9%
3 | 9 | 5.2%
4 | 9 | 5.2%
1/3 | 4 | 2.3%
3~8 | 4 | 2.3%
7 | 3 | 1.7%
5/7 | 2 | 1.1%
5 | 2 | 1.1%
Other values (8) | 16 | 9.2%

Length

[Figure: Histogram of lengths of the category]

Words

Value | Count | Frequency (%)
na | 97 | 55.7%
2 | 16 | 9.2%
1 | 12 | 6.9%
3 | 9 | 5.2%
4 | 9 | 5.2%
1/3 | 4 | 2.3%
3~8 | 4 | 2.3%
7 | 3 | 1.7%
6/8 | 2 | 1.1%
3/4 | 2 | 1.1%
Other values (8) | 16 | 9.2%

상세위치 (detailed location)
Text

Distinct: 137
Distinct (%): 78.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 32
Median length: 24
Mean length: 17.091954
Min length: 2

Characters and Unicode

Total characters: 2974
Distinct characters: 130
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 109
Unique (%): 62.6%

Sample

1st row: (2F) 1번 출입구 근처
2nd row: (1F) 1번 출입구 근처
3rd row: (1F) 2번 출입구 근처
4th row: (2F) 2번 출입구 근처
5th row: (4F) 대합실 고객센터 맞은편

Words

Value | Count | Frequency (%)
방향 | 69 | 8.6%
출입구 | 52 | 6.5%
 | 39 | 4.9%
대합실 | 37 | 4.6%
b1 | 35 | 4.4%
 | 32 | 4.0%
 | 30 | 3.7%
출입문 | 30 | 3.7%
승강장 | 23 | 2.9%
1f | 22 | 2.7%
Other values (134) | 434 | 54.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 637 | 21.4%
1 | 144 | 4.8%
) | 138 | 4.6%
( | 138 | 4.6%
B | 117 | 3.9%
 | 109 | 3.7%
 | 95 | 3.2%
 | 86 | 2.9%
 | 80 | 2.7%
 | 79 | 2.7%
Other values (120) | 1351 | 45.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1424 | 47.9%
Space Separator | 637 | 21.4%
Decimal Number | 393 | 13.2%
Uppercase Letter | 163 | 5.5%
Close Punctuation | 138 | 4.6%
Open Punctuation | 138 | 4.6%
Dash Punctuation | 40 | 1.3%
Math Symbol | 36 | 1.2%
Other Punctuation | 5 | 0.2%

Most frequent character per category

Other Letter

Value | Count | Frequency (%)
 | 109 | 7.7%
 | 95 | 6.7%
 | 86 | 6.0%
 | 80 | 5.6%
 | 79 | 5.5%
 | 75 | 5.3%
 | 54 | 3.8%
 | 49 | 3.4%
 | 46 | 3.2%
 | 42 | 2.9%
Other values (101) | 709 | 49.8%

Decimal Number

Value | Count | Frequency (%)
1 | 144 | 36.6%
2 | 69 | 17.6%
3 | 56 | 14.2%
4 | 51 | 13.0%
5 | 27 | 6.9%
7 | 20 | 5.1%
6 | 7 | 1.8%
8 | 7 | 1.8%
9 | 6 | 1.5%
0 | 6 | 1.5%

Uppercase Letter

Value | Count | Frequency (%)
B | 117 | 71.8%
F | 42 | 25.8%
M | 4 | 2.5%

Space Separator

Value | Count | Frequency (%)
(space) | 637 | 100.0%

Close Punctuation

Value | Count | Frequency (%)
) | 138 | 100.0%

Open Punctuation

Value | Count | Frequency (%)
( | 138 | 100.0%

Dash Punctuation

Value | Count | Frequency (%)
- | 40 | 100.0%

Math Symbol

Value | Count | Frequency (%)
> | 36 | 100.0%

Other Punctuation

Value | Count | Frequency (%)
/ | 5 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1424 | 47.9%
Common | 1387 | 46.6%
Latin | 163 | 5.5%

Most frequent character per script

Hangul

Value | Count | Frequency (%)
 | 109 | 7.7%
 | 95 | 6.7%
 | 86 | 6.0%
 | 80 | 5.6%
 | 79 | 5.5%
 | 75 | 5.3%
 | 54 | 3.8%
 | 49 | 3.4%
 | 46 | 3.2%
 | 42 | 2.9%
Other values (101) | 709 | 49.8%

Common

Value | Count | Frequency (%)
(space) | 637 | 45.9%
1 | 144 | 10.4%
) | 138 | 9.9%
( | 138 | 9.9%
2 | 69 | 5.0%
3 | 56 | 4.0%
4 | 51 | 3.7%
- | 40 | 2.9%
> | 36 | 2.6%
5 | 27 | 1.9%
Other values (6) | 51 | 3.7%

Latin

Value | Count | Frequency (%)
B | 117 | 71.8%
F | 42 | 25.8%
M | 4 | 2.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1550 | 52.1%
Hangul | 1424 | 47.9%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
(space) | 637 | 41.1%
1 | 144 | 9.3%
) | 138 | 8.9%
( | 138 | 8.9%
B | 117 | 7.5%
2 | 69 | 4.5%
3 | 56 | 3.6%
4 | 51 | 3.3%
F | 42 | 2.7%
- | 40 | 2.6%
Other values (9) | 118 | 7.6%

Hangul

Value | Count | Frequency (%)
 | 109 | 7.7%
 | 95 | 6.7%
 | 86 | 6.0%
 | 80 | 5.6%
 | 79 | 5.5%
 | 75 | 5.3%
 | 54 | 3.8%
 | 49 | 3.4%
 | 46 | 3.2%
 | 42 | 2.9%
Other values (101) | 709 | 49.8%

시작층 (start floor; 지하 = below ground, 지상 = above ground)
Categorical

HIGH CORRELATION

Distinct: 14
Distinct (%): 8.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

지하1: 44
지상1: 30
지하2: 21
지하3: 19
지하4: 13
Other values (9): 47

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지상2
2nd row: 지상1
3rd row: 지상1
4th row: 지상2
5th row: 지상4

Common Values

Value | Count | Frequency (%)
지하1 | 44 | 25.3%
지상1 | 30 | 17.2%
지하2 | 21 | 12.1%
지하3 | 19 | 10.9%
지하4 | 13 | 7.5%
지하5 | 12 | 6.9%
지상2 | 10 | 5.7%
지하7 | 8 | 4.6%
지상3 | 5 | 2.9%
지상4 | 4 | 2.3%
Other values (4) | 8 | 4.6%

Length

[Figure: Histogram of lengths of the category]

종료층 (end floor; 지하 = below ground, 지상 = above ground)
Categorical

HIGH CORRELATION

Distinct: 14
Distinct (%): 8.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

지하1: 41
지상1: 35
지하2: 24
지상2: 16
지하4: 13
Other values (9): 45

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지상1
2nd row: 지상2
3rd row: 지상2
4th row: 지상1
5th row: 지상2

Common Values

Value | Count | Frequency (%)
지하1 | 41 | 23.6%
지상1 | 35 | 20.1%
지하2 | 24 | 13.8%
지상2 | 16 | 9.2%
지하4 | 13 | 7.5%
지하3 | 9 | 5.2%
지상4 | 8 | 4.6%
지하5 | 8 | 4.6%
지하7 | 8 | 4.6%
지상3 | 4 | 2.3%
Other values (4) | 8 | 4.6%

Length

[Figure: Histogram of lengths of the category]

Correlations

 | 역명 | 상하행구분 | 출입구번호 | 시작층 | 종료층
역명 | 1.000 | 0.000 | 0.861 | 0.778 | 0.757
상하행구분 | 0.000 | 1.000 | 0.000 | 0.162 | 0.293
출입구번호 | 0.861 | 0.000 | 1.000 | 0.717 | 0.740
시작층 | 0.778 | 0.162 | 0.717 | 1.000 | 0.933
종료층 | 0.757 | 0.293 | 0.740 | 0.933 | 1.000

 | 역명 | 출입구번호 | 시작층 | 상하행구분 | 종료층
역명 | 1.000 | 0.491 | 0.393 | 0.000 | 0.370
출입구번호 | 0.491 | 1.000 | 0.370 | 0.000 | 0.380
시작층 | 0.393 | 0.370 | 1.000 | 0.120 | 0.525
상하행구분 | 0.000 | 0.000 | 0.120 | 1.000 | 0.220
종료층 | 0.370 | 0.380 | 0.525 | 0.220 | 1.000

 | 역명 | 상하행구분 | 출입구번호 | 시작층 | 종료층
역명 | 1.000 | 0.000 | 0.491 | 0.393 | 0.370
상하행구분 | 0.000 | 1.000 | 0.000 | 0.120 | 0.220
출입구번호 | 0.491 | 0.000 | 1.000 | 0.370 | 0.380
시작층 | 0.393 | 0.120 | 0.370 | 1.000 | 0.525
종료층 | 0.370 | 0.220 | 0.380 | 0.525 | 1.000
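
The report does not state which coefficient each matrix uses. ydata-profiling's documentation describes Cramér's V as its measure for pairs of categorical variables, so the sketch below assumes that measure and computes the plain, uncorrected version with the standard library. The function name `cramers_v` is hypothetical, and the result is not expected to reproduce the figures above (such as 0.933 for 시작층 and 종료층), which come from all 174 rows and possibly a bias-corrected variant.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length categorical sequences."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k == 0:
        # Undefined when either variable is constant; 0.0 is a convention here.
        return 0.0
    chi2 = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            chi2 += (observed[(r, c)] - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))


# 시작층 and 종료층 values of the first five sample rows in this report.
start = ["지상2", "지상1", "지상1", "지상2", "지상4"]
end = ["지상1", "지상2", "지상2", "지상1", "지상2"]
print(round(cramers_v(start, end), 3))
```

Cramér's V ranges from 0 (no association) to 1 (one variable fully determines the other), and unlike Pearson or Spearman it needs no ordering of the categories, which suits floor labels stored as text.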

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
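
The report counts 0 missing cells, yet 출입구번호 lists the value "<NA>" 97 times (55.7%). This suggests the placeholder reached the profiler as an ordinary string rather than as a null. A minimal sketch of counting such placeholders follows; the helper name `placeholder_share` is hypothetical, and treating "<NA>" as missing is an assumption about the data, not something the report confirms.

```python
PLACEHOLDER = "<NA>"


def placeholder_share(values, placeholder=PLACEHOLDER):
    """Return (count, percent) of entries equal to the placeholder string."""
    count = sum(1 for value in values if value == placeholder)
    percent = round(100 * count / len(values), 1) if values else 0.0
    return count, percent


# 출입구번호 values of the ten first rows shown in the Sample section.
first_rows = ["1", "1", "2", "2", "<NA>", "<NA>", "5", "5", "<NA>", "<NA>"]
print(placeholder_share(first_rows))
```

If the placeholder does stand for a missing exit number, converting it to a real null before profiling would move those 97 cells into the Missing cells statistic.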

Sample

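 | 철도운영기관 | 선명 | 역명 | 상하행구분 | 출입구번호 | 상세위치 | 시작층 | 종료층
0 | 부산교통공사 | 3호선 | 강서구청 | 하행 | 1 | (2F) 1번 출입구 근처 | 지상2 | 지상1
1 | 부산교통공사 | 3호선 | 강서구청 | 상행 | 1 | (1F) 1번 출입구 근처 | 지상1 | 지상2
2 | 부산교통공사 | 3호선 | 강서구청 | 상행 | 2 | (1F) 2번 출입구 근처 | 지상1 | 지상2
3 | 부산교통공사 | 3호선 | 강서구청 | 하행 | 2 | (2F) 2번 출입구 근처 | 지상2 | 지상1
4 | 부산교통공사 | 3호선 | 강서구청 | 하행 | <NA> | (4F) 대합실 고객센터 맞은편 | 지상4 | 지상2
5 | 부산교통공사 | 3호선 | 강서구청 | 상행 | <NA> | (2F) 대합실 8번 창고 앞 | 지상2 | 지상4
6 | 부산교통공사 | 3호선 | 강서구청 | 상행 | 5 | (1F) 5번 출입구 근처 | 지상1 | 지상4
7 | 부산교통공사 | 3호선 | 강서구청 | 하행 | 5 | (4F) 5번 출입구 근처 | 지상4 | 지상1
8 | 부산교통공사 | 3호선 | 강서구청 | 상행 | <NA> | (4F) 1번대 표내는 곳 내 3번 계단 맞은편 | 지상4 | 지상5
9 | 부산교통공사 | 3호선 | 강서구청 | 하행 | <NA> | (5F) 구포역 방향 2-4 출읿문 앞 | 지상5 | 지상4

 | 철도운영기관 | 선명 | 역명 | 상하행구분 | 출입구번호 | 상세위치 | 시작층 | 종료층
164 | 부산교통공사 | 3호선 | 체육공원 | 하행 | 2 | 1층 2번출구 옆 | 지상1 | 지상2
165 | 부산교통공사 | 3호선 | 체육공원 | 상행 | 2 | 1층 2번출구 옆 | 지상1 | 지상2
166 | 부산교통공사 | 3호선 | 체육공원 | 상행 | 2 | 1층 주차장 중간 | 지상1 | 지상2
167 | 부산교통공사 | 3호선 | 체육공원 | 하행 | 2 | 1층 주차장 중간 | 지상1 | 지상2
168 | 부산교통공사 | 3호선 | 체육공원 | 하행 | 3 | 2층 | 지상2 | 지상3
169 | 부산교통공사 | 3호선 | 체육공원 | 상행 | 3 | 2층 대합실 중간 | 지상2 | 지상3
170 | 부산교통공사 | 3호선 | 체육공원 | 상행 | <NA> | 대합실 남쪽(1번대 게이트 옆) | 지상3 | 지상4
171 | 부산교통공사 | 3호선 | 체육공원 | 하행 | <NA> | 대합실 남쪽(1번대 게이트 옆) | 지상3 | 지상4
172 | 부산교통공사 | 3호선 | 체육공원 | 상행 | <NA> | 대합실 남쪽(1번대 게이트 옆) | 지상3 | 지상4
173 | 부산교통공사 | 3호선 | 체육공원 | 하행 | <NA> | 대합실 남쪽(1번대 게이트 옆) | 지상3 | 지상4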

Duplicate rows

Most frequently occurring

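 | 철도운영기관 | 선명 | 역명 | 상하행구분 | 출입구번호 | 상세위치 | 시작층 | 종료층 | # duplicates
0 | 부산교통공사 | 3호선 | 만덕 | 상행 | <NA> | 지하7층 > 지하5층 | 지하7 | 지하5 | 2
1 | 부산교통공사 | 3호선 | 만덕 | 상행 | <NA> | 지하9층 > 지하7층 | 지하9 | 지하7 | 2
2 | 부산교통공사 | 3호선 | 만덕 | 하행 | <NA> | 지하5층 > 지하7층 | 지하5 | 지하7 | 2
3 | 부산교통공사 | 3호선 | 만덕 | 하행 | <NA> | 지하7층 > 지하9층 | 지하7 | 지하9 | 2
4 | 부산교통공사 | 3호선 | 배산 | 상행 | <NA> | (B4) 대합실 | 지하4 | 지하1 | 2
5 | 부산교통공사 | 3호선 | 배산 | 상행 | <NA> | (B7) 대합실 | 지하7 | 지하4 | 2
6 | 부산교통공사 | 3호선 | 배산 | 하행 | <NA> | (B4) 대합실 | 지하4 | 지하7 | 2
7 | 부산교통공사 | 3호선 | 배산 | 하행 | <NA> | (B7) 대합실 | 지하7 | 지하8 | 2
8 | 부산교통공사 | 3호선 | 연산 | 하행 | <NA> | (B2)교대역 방향 환승통로 | 지하2 | 지하4 | 2
9 | 부산교통공사 | 3호선 | 연산 | 하행 | <NA> | (B2)시청역 방향 환승통로 | 지하2 | 지하4 | 2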
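
Rows that repeat exactly, as in the table above, can be found by counting whole rows. A minimal standard-library sketch: the helper name `duplicate_groups` is hypothetical, and the three tuples are reduced to five columns. The repeated tuple mirrors entry 0 of the table; the last tuple is row 0 of the Sample section.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return {row: count} for rows that occur more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# (역명, 상하행구분, 상세위치, 시작층, 종료층)
rows = [
    ("만덕", "상행", "지하7층 > 지하5층", "지하7", "지하5"),
    ("만덕", "상행", "지하7층 > 지하5층", "지하7", "지하5"),
    ("강서구청", "하행", "(2F) 1번 출입구 근처", "지상2", "지상1"),
]
groups = duplicate_groups(rows)
print(groups)
```

Whether the repeated rows are data-entry duplicates or distinct physical installations that share a description cannot be decided from the report alone, so dropping them is a judgment call rather than a safe default.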