Overview

Dataset statistics

Number of variables 8
Number of observations 202
Missing cells 139
Missing cells (%) 8.6%
Duplicate rows 10
Duplicate rows (%) 5.0%
Total size in memory 13.0 KiB
Average record size in memory 65.7 B
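
The two percentages above can be recomputed from the counts. This is a minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (observations × variables) and the duplicate percentage over observations. The variable names are illustrative and do not come from the report.

```python
# Counts copied from the "Dataset statistics" table.
n_variables = 8
n_observations = 202
missing_cells = 139
duplicate_rows = 10

# Assumption: the denominator for missing cells is every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: the denominator for duplicates is the number of observations.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```

Both values round to the figures reported above, which is consistent with the assumed denominators.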

Variable types

Categorical 6
Numeric 1
Text 1

Dataset

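Description This dataset covers the urban and metropolitan railway stations on Busan Line 2 (부산2호선), with the following fields: railway operator (철도운영기관), line name (선명), station name (역명), up/down direction (상하행구분), exit number (출입구번호), detailed location (상세위치), start floor (시작층), and end floor (종료층).
Author 국가철도공단 (Korea National Railway)
URL https://www.data.go.kr/data/15041354/fileData.do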

Alerts

철도운영기관 has constant value "부산교통공사" (Constant)
선명 has constant value "2호선" (Constant)
Dataset has 10 (5.0%) duplicate rows (Duplicates)
출입구번호 is highly overall correlated with 역명 (High correlation)
역명 is highly overall correlated with 출입구번호 and 2 other fields (High correlation)
상하행구분 is highly overall correlated with 시작층 and 1 other field (High correlation)
시작층 is highly overall correlated with 역명 and 2 other fields (High correlation)
종료층 is highly overall correlated with 역명 and 2 other fields (High correlation)
출입구번호 has 139 (68.8%) missing values (Missing)

Reproduction

Analysis started 2023-12-12 12:55:10.112830
Analysis finished 2023-12-12 12:55:10.760593
Duration 0.65 seconds
Software version ydata-profiling v4.5.1
Download configuration config.json
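
A report of this kind can be regenerated with ydata-profiling. The sketch below is one way to do it, not the exact configuration used here: the contents of config.json are not shown, so default settings are assumed. The function name, the report title, the output path, and the cp949 encoding are all assumptions. The author and URL values are copied from the Dataset section.

```python
from pathlib import Path


def build_report(csv_path, out_path="report.html", encoding="cp949"):
    """Profile a CSV file and write the HTML report to out_path."""
    # Third-party imports are kept local so this module loads without them.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source file is a CSV in cp949, an encoding often
    # used for Korean public data; pass encoding="utf-8" if that fails.
    df = pd.read_csv(csv_path, encoding=encoding)

    report = ProfileReport(
        df,
        title="Busan Line 2 profile",  # hypothetical title
        dataset={
            "description": "Busan Line 2 station records",
            "author": "국가철도공단",
            "url": "https://www.data.go.kr/data/15041354/fileData.do",
        },
    )
    report.to_file(out_path)
    return Path(out_path)
```

Calling `build_report("busan_line2.csv")` with a hypothetical local file name would write `report.html` next to the script.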

Variables

철도운영기관
Categorical

CONSTANT 

Distinct 1
Distinct (%) 0.5%
Missing 0
Missing (%) 0.0%
Memory size 1.7 KiB
부산교통공사 202

Length

Max length 6
Median length 6
Mean length 6
Min length 6

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 부산교통공사
2nd row 부산교통공사
3rd row 부산교통공사
4th row 부산교통공사
5th row 부산교통공사

Common Values

Value Count Frequency (%)
부산교통공사 202 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

선명
Categorical

CONSTANT 

Distinct 1
Distinct (%) 0.5%
Missing 0
Missing (%) 0.0%
Memory size 1.7 KiB
2호선 202

Length

Max length 3
Median length 3
Mean length 3
Min length 3

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 2호선
2nd row 2호선
3rd row 2호선
4th row 2호선
5th row 2호선

Common Values

Value Count Frequency (%)
2호선 202 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

역명
Categorical

HIGH CORRELATION 

Distinct 29
Distinct (%) 14.4%
Missing 0
Missing (%) 0.0%
Memory size 1.7 KiB
센텀시티(BEXCO·신세계) 14
양산(시청·동원과학기술대학교) 12
남양산(범어) 10
민락 8
금련산 8
Other values (24) 150

Length

Max length 16
Median length 15
Mean length 6.2673267
Min length 2

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 가야
2nd row 가야
3rd row 가야
4th row 가야
5th row 가야

Common Values

Value Count Frequency (%)
센텀시티(BEXCO·신세계) 14 6.9%
양산(시청·동원과학기술대학교) 12 5.9%
남양산(범어) 10 5.0%
민락 8 4.0%
금련산 8 4.0%
구명 8 4.0%
구남 8 4.0%
광안 8 4.0%
남천(KBS·수영구청) 8 4.0%
대연(고려병원) 8 4.0%
Other values (19) 110 54.5%

Length

[Figure: Histogram of lengths of the category]

Value Count Frequency (%)
센텀시티(bexco·신세계 14 6.9%
양산(시청·동원과학기술대학교 12 5.9%
남양산(범어 10 5.0%
못골(남구청 8 4.0%
가야 8 4.0%
동의대 8 4.0%
지게골 8 4.0%
증산 8 4.0%
개금 8 4.0%
문현 8 4.0%
Other values (19) 110 54.5%

상하행구분
Categorical

HIGH CORRELATION 

Distinct 2
Distinct (%) 1.0%
Missing 0
Missing (%) 0.0%
Memory size 1.7 KiB
상행 104
하행 98

Length

Max length 2
Median length 2
Mean length 2
Min length 2

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 상행
2nd row 하행
3rd row 하행
4th row 상행
5th row 상행

Common Values

Value Count Frequency (%)
상행 104 51.5%
하행 98 48.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

출입구번호
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct 10
Distinct (%) 15.9%
Missing 139
Missing (%) 68.8%
Infinite 0
Infinite (%) 0.0%
Mean 4.2539683
Minimum 1
Maximum 15
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 1.9 KiB

Quantile statistics

Minimum 1
5-th percentile 1
Q1 1
median 3
Q3 7
95-th percentile 9
Maximum 15
Range 14
Interquartile range (IQR) 6

Descriptive statistics

Standard deviation 3.3647521
Coefficient of variation (CV) 0.79096784
Kurtosis 1.4841663
Mean 4.2539683
Median Absolute Deviation (MAD) 2
Skewness 1.2130849
Sum 268
Variance 11.321557
Monotonicity Not monotonic
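
Several of these statistics can be checked against the value counts reported for this variable. The sketch below rebuilds the 63 non-missing values from those counts and recomputes the summary figures with the Python standard library. It assumes the sample (n - 1) standard deviation and linearly interpolated quartiles, which is what reproduces the reported numbers here; the variable names are illustrative.

```python
import statistics

# Value -> count for the 63 non-missing exit numbers,
# copied from the frequency table for 출입구번호.
counts = {1: 17, 2: 7, 3: 13, 5: 5, 6: 4, 7: 6, 8: 6, 9: 2, 11: 1, 15: 2}
values = sorted(v for v, c in counts.items() for _ in range(c))

total = sum(values)
mean = statistics.mean(values)
std = statistics.stdev(values)        # sample standard deviation (n - 1)
cv = std / mean                       # coefficient of variation
median = statistics.median(values)

# Quartiles with linear interpolation over the inclusive range.
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(total, round(mean, 7), round(std, 7), round(cv, 8))
print(median, q1, q3, q3 - q1, mad)
```

The coefficient of variation is simply the standard deviation divided by the mean, so a value near 0.79 indicates a spread almost as large as the mean itself.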
[Figure: Histogram with fixed size bins (bins=10)]

Value Count Frequency (%)
1 17 8.4%
3 13 6.4%
2 7 3.5%
7 6 3.0%
8 6 3.0%
5 5 2.5%
6 4 2.0%
9 2 1.0%
15 2 1.0%
11 1 0.5%
(Missing) 139 68.8%

Values in ascending order

Value Count Frequency (%)
1 17 8.4%
2 7 3.5%
3 13 6.4%
5 5 2.5%
6 4 2.0%
7 6 3.0%
8 6 3.0%
9 2 1.0%
11 1 0.5%
15 2 1.0%

Values in descending order

Value Count Frequency (%)
15 2 1.0%
11 1 0.5%
9 2 1.0%
8 6 3.0%
7 6 3.0%
6 4 2.0%
5 5 2.5%
3 13 6.4%
2 7 3.5%
1 17 8.4%

상세위치
Text

Distinct 174
Distinct (%) 86.1%
Missing 0
Missing (%) 0.0%
Memory size 1.7 KiB

Length

Max length 35
Median length 29.5
Mean length 21.049505
Min length 8

Characters and Unicode

Total characters 4252
Distinct characters 134
Distinct categories 9
Distinct scripts 3
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 151
Unique (%) 74.8%

Sample

1st row (B3) 부암역 방향 승강장 5-1 출입문 앞
2nd row (B1) 표내는 곳 내 1/3번 출입구 방향
3rd row (B1) 표내는 곳 내 2/4번 출입구 방향
4th row (B3) 동의대역 방향 승강장 6-4 출입문 앞
5th row (B3) 부암역 방향 승강장 1-1 출입문 앞
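
The sample values start with a floor tag such as (B3) or (2F), and in the rows shown in this report that tag agrees with the 시작층 column (B3 with 지하3, 2F with 지상2). The sketch below extracts the leading tag with a regular expression and converts it to the same label format. The function name is hypothetical, and the mapping is inferred only from the rows visible here, so it may not hold for every record.

```python
import re

# Leading "(B<n>)" marks a basement level, "(<n>F)" a level above ground.
_FLOOR_TAG = re.compile(r"^\(\s*(?:B(?P<below>\d+)|(?P<above>\d+)F)\s*\)")


def floor_label(location):
    """Return a 시작층-style label for the leading floor tag, or None."""
    match = _FLOOR_TAG.match(location.strip())
    if match is None:
        return None
    if match.group("below") is not None:
        return "지하" + match.group("below")   # 지하 = below ground
    return "지상" + match.group("above")        # 지상 = above ground


print(floor_label("(B3) 부암역 방향 승강장 5-1 출입문 앞"))
print(floor_label("(2F) 3번 출입구 근처"))
```

A check like this is one way to see why the text column and the floor columns carry overlapping information.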
ValueCountFrequency (%)
방향 121
 
10.1%
80
 
6.7%
출입구 76
 
6.4%
출입문 69
 
5.8%
b1 62
 
5.2%
47
 
3.9%
44
 
3.7%
승강장 42
 
3.5%
41
 
3.4%
b3 37
 
3.1%
Other values (142) 574
48.1%

Most occurring characters

ValueCountFrequency (%)
1013
23.8%
( 191
 
4.5%
) 191
 
4.5%
1 186
 
4.4%
158
 
3.7%
157
 
3.7%
B 147
 
3.5%
129
 
3.0%
121
 
2.8%
113
 
2.7%
Other values (124) 1846
43.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 2017
47.4%
Space Separator 1013
23.8%
Decimal Number 529
 
12.4%
Uppercase Letter 203
 
4.8%
Open Punctuation 191
 
4.5%
Close Punctuation 191
 
4.5%
Dash Punctuation 82
 
1.9%
Other Punctuation 17
 
0.4%
Math Symbol 9
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
158
 
7.8%
157
 
7.8%
129
 
6.4%
121
 
6.0%
113
 
5.6%
107
 
5.3%
93
 
4.6%
86
 
4.3%
82
 
4.1%
80
 
4.0%
Other values (100) 891
44.2%
Decimal Number
ValueCountFrequency (%)
1 186
35.2%
2 99
18.7%
3 88
16.6%
4 68
 
12.9%
5 30
 
5.7%
6 24
 
4.5%
0 19
 
3.6%
7 8
 
1.5%
8 5
 
0.9%
9 2
 
0.4%
Uppercase Letter
ValueCountFrequency (%)
B 147
72.4%
F 48
 
23.6%
L 3
 
1.5%
E 3
 
1.5%
C 1
 
0.5%
I 1
 
0.5%
Other Punctuation
ValueCountFrequency (%)
/ 16
94.1%
. 1
 
5.9%
Math Symbol
ValueCountFrequency (%)
> 5
55.6%
~ 4
44.4%
Space Separator
ValueCountFrequency (%)
1013
100.0%
Open Punctuation
ValueCountFrequency (%)
( 191
100.0%
Close Punctuation
ValueCountFrequency (%)
) 191
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 82
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2032
47.8%
Hangul 2017
47.4%
Latin 203
 
4.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
158
 
7.8%
157
 
7.8%
129
 
6.4%
121
 
6.0%
113
 
5.6%
107
 
5.3%
93
 
4.6%
86
 
4.3%
82
 
4.1%
80
 
4.0%
Other values (100) 891
44.2%
Common
ValueCountFrequency (%)
1013
49.9%
( 191
 
9.4%
) 191
 
9.4%
1 186
 
9.2%
2 99
 
4.9%
3 88
 
4.3%
- 82
 
4.0%
4 68
 
3.3%
5 30
 
1.5%
6 24
 
1.2%
Other values (8) 60
 
3.0%
Latin
ValueCountFrequency (%)
B 147
72.4%
F 48
 
23.6%
L 3
 
1.5%
E 3
 
1.5%
C 1
 
0.5%
I 1
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2235
52.6%
Hangul 2017
47.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1013
45.3%
( 191
 
8.5%
) 191
 
8.5%
1 186
 
8.3%
B 147
 
6.6%
2 99
 
4.4%
3 88
 
3.9%
- 82
 
3.7%
4 68
 
3.0%
F 48
 
2.1%
Other values (14) 122
 
5.5%
Hangul
ValueCountFrequency (%)
158
 
7.8%
157
 
7.8%
129
 
6.4%
121
 
6.0%
113
 
5.6%
107
 
5.3%
93
 
4.6%
86
 
4.3%
82
 
4.1%
80
 
4.0%
Other values (100) 891
44.2%

시작층
Categorical

HIGH CORRELATION 

Distinct 8
Distinct (%) 4.0%
Missing 0
Missing (%) 0.0%
Memory size 1.7 KiB
지하1 65
지하3 50
지상1 23
지하2 22
지상2 17
Other values (3) 25

Length

Max length 3
Median length 3
Mean length 3
Min length 3

Unique

Unique 1
Unique (%) 0.5%

Sample

1st row 지하3
2nd row 지하1
3rd row 지하1
4th row 지하3
5th row 지하3

Common Values

Value Count Frequency (%)
지하1 65 32.2%
지하3 50 24.8%
지상1 23 11.4%
지하2 22 10.9%
지상2 17 8.4%
지하4 16 7.9%
지상3 8 4.0%
지상4 1 0.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

종료층
Categorical

HIGH CORRELATION 

Distinct 8
Distinct (%) 4.0%
Missing 0
Missing (%) 0.0%
Memory size 1.7 KiB
지하1 70
지하3 40
지하2 24
지상1 24
지상2 19
Other values (3) 25

Length

Max length 3
Median length 3
Mean length 3
Min length 3

Unique

Unique 1
Unique (%) 0.5%

Sample

1st row 지하1
2nd row 지하3
3rd row 지하3
4th row 지하1
5th row 지하1

Common Values

Value Count Frequency (%)
지하1 70 34.7%
지하3 40 19.8%
지하2 24 11.9%
지상1 24 11.9%
지상2 19 9.4%
지하4 16 7.9%
지상3 8 4.0%
지상4 1 0.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]
variable | 역명 | 상하행구분 | 출입구번호 | 시작층 | 종료층
역명 | 1.000 | 0.000 | 0.871 | 0.853 | 0.854
상하행구분 | 0.000 | 1.000 | 0.000 | 0.772 | 0.820
출입구번호 | 0.871 | 0.000 | 1.000 | 0.240 | 0.127
시작층 | 0.853 | 0.772 | 0.240 | 1.000 | 0.959
종료층 | 0.854 | 0.820 | 0.127 | 0.959 | 1.000

[Figure: correlation heatmap]
variable | 역명 | 시작층 | 상하행구분 | 종료층
역명 | 1.000 | 0.521 | 0.000 | 0.523
시작층 | 0.521 | 1.000 | 0.587 | 0.679
상하행구분 | 0.000 | 0.587 | 1.000 | 0.630
종료층 | 0.523 | 0.679 | 0.630 | 1.000

[Figure: correlation heatmap]
variable | 출입구번호 | 역명 | 상하행구분 | 시작층 | 종료층
출입구번호 | 1.000 | 0.626 | 0.000 | 0.170 | 0.127
역명 | 0.626 | 1.000 | 0.000 | 0.521 | 0.523
상하행구분 | 0.000 | 0.000 | 1.000 | 0.587 | 0.630
시작층 | 0.170 | 0.521 | 0.587 | 1.000 | 0.679
종료층 | 0.127 | 0.523 | 0.630 | 0.679 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

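index | 철도운영기관 | 선명 | 역명 | 상하행구분 | 출입구번호 | 상세위치 | 시작층 | 종료층
0 | 부산교통공사 | 2호선 | 가야 | 상행 | <NA> | (B3) 부암역 방향 승강장 5-1 출입문 앞 | 지하3 | 지하1
1 | 부산교통공사 | 2호선 | 가야 | 하행 | <NA> | (B1) 표내는 곳 내 1/3번 출입구 방향 | 지하1 | 지하3
2 | 부산교통공사 | 2호선 | 가야 | 하행 | <NA> | (B1) 표내는 곳 내 2/4번 출입구 방향 | 지하1 | 지하3
3 | 부산교통공사 | 2호선 | 가야 | 상행 | <NA> | (B3) 동의대역 방향 승강장 6-4 출입문 앞 | 지하3 | 지하1
4 | 부산교통공사 | 2호선 | 가야 | 상행 | <NA> | (B3) 부암역 방향 승강장 1-1 출입문 앞 | 지하3 | 지하1
5 | 부산교통공사 | 2호선 | 가야 | 하행 | <NA> | (B1) 표내는 곳 내 3번 출입구 방향 | 지하1 | 지하3
6 | 부산교통공사 | 2호선 | 가야 | 하행 | <NA> | (B1) 표내는 곳 내 4번 출입구 방향 | 지하1 | 지하3
7 | 부산교통공사 | 2호선 | 가야 | 상행 | <NA> | (B3) 동의대역 방향 승강장 2-4 출입문 앞 | 지하3 | 지하1
8 | 부산교통공사 | 2호선 | 개금 | 상행 | <NA> | (B3) 동의대역 방향 4-3 출입문 앞 | 지하3 | 지하1
9 | 부산교통공사 | 2호선 | 개금 | 하행 | <NA> | (B1) 13번 출입구 쪽 표 내는 곳 내 | 지하1 | 지하3

index | 철도운영기관 | 선명 | 역명 | 상하행구분 | 출입구번호 | 상세위치 | 시작층 | 종료층
192 | 부산교통공사 | 2호선 | 지게골 | 상행 | <NA> | (B4) 못골역 방면 승강장 1-1출입문앞 | 지하4 | 지하2
193 | 부산교통공사 | 2호선 | 지게골 | 하행 | <NA> | (B2) 전기실 앞 | 지하2 | 지하4
194 | 부산교통공사 | 2호선 | 지게골 | 하행 | <NA> | (B2) 신호기기실 앞 | 지하2 | 지하4
195 | 부산교통공사 | 2호선 | 지게골 | 상행 | <NA> | (B4) 문현역 방면 승강장 2-4출입문 앞 | 지하4 | 지하2
196 | 부산교통공사 | 2호선 | 호포 | 상행 | 3 | (2F) 3번 출입구 근처 | 지상2 | 지상4
197 | 부산교통공사 | 2호선 | 호포 | 상행 | <NA> | (4F) 개찰구 | 지상4 | 지상2
198 | 부산교통공사 | 2호선 | 화명 | 상행 | 1 | (B1) 대합실 1번 출입구 방향 > (1F) 1번 출입구 앞 | 지하1 | 지상1
199 | 부산교통공사 | 2호선 | 화명 | 하행 | 1 | (1F) 1번 출입구 앞 > (B1) 대합실 1번 출입구 방향 | 지상1 | 지하1
200 | 부산교통공사 | 2호선 | 화명 | 상행 | 2 | (B1) 대합실 2번 출입구 방향 > (1F) 2번 출입구 앞 | 지하1 | 지상1
201 | 부산교통공사 | 2호선 | 화명 | 하행 | 2 | (1F) 2번 출입구 앞 > (B1) 대합실 2번 출입구 방향 | 지상1 | 지하1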

Duplicate rows

Most frequently occurring

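index | 철도운영기관 | 선명 | 역명 | 상하행구분 | 출입구번호 | 상세위치 | 시작층 | 종료층 | # duplicates
0 | 부산교통공사 | 2호선 | 구남 | 하행 | <NA> | (B1)표 내는 곳 내 | 지하1 | 지하3 | 4
7 | 부산교통공사 | 2호선 | 센텀시티(BEXCO·신세계) | 하행 | <NA> | (B1) 표 내는 곳 계단 옆 | 지하1 | 지하2 | 3
1 | 부산교통공사 | 2호선 | 남양산(범어) | 상행 | <NA> | (2F) 2층 대합실 10번대 표 내는 곳 | 지상2 | 지상3 | 2
2 | 부산교통공사 | 2호선 | 대연(고려병원) | 하행 | <NA> | (B1) 1~4 표내는 곳 안 | 지하1 | 지하3 | 2
3 | 부산교통공사 | 2호선 | 동의대 | 하행 | <NA> | (B2) 5번/7번 출입구 쪽 표내는 곳 내 | 지하2 | 지하4 | 2
4 | 부산교통공사 | 2호선 | 동의대 | 하행 | <NA> | (B2)1번 출입구 쪽 표 내는 곳 내 | 지하2 | 지하4 | 2
5 | 부산교통공사 | 2호선 | 민락 | 하행 | <NA> | (B1) 표 내는 곳 내 1번/2번 출입구 방향 | 지하1 | 지하3 | 2
6 | 부산교통공사 | 2호선 | 민락 | 하행 | <NA> | (B1) 표 내는 곳 내 3번/4번 출입구 방향 | 지하1 | 지하3 | 2
8 | 부산교통공사 | 2호선 | 증산 | 상행 | 1 | (1F) 1번 출입구 근처 | 지상1 | 지상2 | 2
9 | 부산교통공사 | 2호선 | 증산 | 하행 | 1 | (1F) 1번 출입구 근처 | 지상1 | 지상2 | 2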
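
The headline figure of 10 duplicate rows equals the number of entries in this table, which suggests it counts distinct duplicated patterns, not redundant copies. The sketch below makes that distinction explicit using the counts from the "# duplicates" column. The row contents are stand-in strings and the function name is hypothetical; only the counts come from the report.

```python
from collections import Counter


def duplicate_summary(rows):
    """Summarise duplicated rows in an iterable of hashable records."""
    counts = Counter(rows)
    groups = {row: n for row, n in counts.items() if n > 1}
    return {
        "groups": len(groups),                      # distinct duplicated patterns
        "rows_in_groups": sum(groups.values()),     # every row that has a twin
        "extra_copies": sum(n - 1 for n in groups.values()),
    }


# Counts copied from the "# duplicates" column; labels are placeholders.
listed = [4, 3, 2, 2, 2, 2, 2, 2, 2, 2]
rows = [f"pattern-{i}" for i, n in enumerate(listed) for _ in range(n)]

summary = duplicate_summary(rows)
print(summary)
```

Under this reading, dropping duplicates would remove the extra copies and leave one row per pattern.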