Overview

Dataset statistics

Number of variables: 7
Number of observations: 114
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.5 KiB
Average record size in memory: 58.2 B

Variable types

Categorical: 4
Text: 3

Dataset

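Description: Data on the urban and metropolitan railway stations managed by Busan Transportation Corporation (부산교통공사): railway operator name (철도운영기관명), line name (선명), station name (역명), above ground or underground (지상지하), station floor (역층), exit number (출구번호), detailed location (상세위치), and telephone number (전화번호).
Author: Korea National Railway (국가철도공단)
URL: https://www.data.go.kr/data/15041399/fileData.do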

Alerts

철도운영기관명 has constant value "부산교통공사" (Constant)
역층 is highly imbalanced (56.6%) (Imbalance)
전화번호 has unique values (Unique)
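
The 56.6% in the imbalance alert is consistent with a normalized-entropy score, 1 - H / log2(k), applied to the 역층 value counts listed later in this report (92, 18, 3, 1). The report does not state the formula, so the sketch below is an assumption that happens to reproduce the figure; `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (in bits) of the class frequencies."""
    total = sum(counts)
    k = len(counts)
    if k <= 1:
        return 0.0  # a single class leaves nothing to compare
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Value counts of 역층 as listed in this report (floors 1, 2, 4, 3)
floor_counts = [92, 18, 3, 1]
score = imbalance_score(floor_counts)
print(f"{score:.1%}")
```

A score of 0 means perfectly balanced classes; a score near 1 means almost all rows fall in one class.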

Reproduction

Analysis started: 2023-12-12 07:42:10.421695
Analysis finished: 2023-12-12 07:42:10.967454
Duration: 0.55 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

철도운영기관명 (railway operator name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB
부산교통공사 (Busan Transportation Corporation): 114

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 부산교통공사
2nd row: 부산교통공사
3rd row: 부산교통공사
4th row: 부산교통공사
5th row: 부산교통공사

Common Values

Value | Count | Frequency (%)
부산교통공사 | 114 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

선명 (line name)
Categorical

Distinct: 4
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB
2호선 (Line 2): 43
1호선 (Line 1): 40
3호선 (Line 3): 17
4호선 (Line 4): 14

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1호선
2nd row: 1호선
3rd row: 1호선
4th row: 1호선
5th row: 1호선

Common Values

Value | Count | Frequency (%)
2호선 | 43 | 37.7%
1호선 | 40 | 35.1%
3호선 | 17 | 14.9%
4호선 | 14 | 12.3%
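
Each Frequency (%) value is the count divided by the 114 observations. As a minimal sketch, the table can be recomputed with the standard library from the counts listed above; the row order of the original file is not preserved, and the variable names are illustrative.

```python
from collections import Counter

# Rebuild the 선명 column from the counts in this report (original row order is lost).
line_counts = {"2호선": 43, "1호선": 40, "3호선": 17, "4호선": 14}
lines = [name for name, n in line_counts.items() for _ in range(n)]

counts = Counter(lines)
total = sum(counts.values())

# One (value, count, percentage) row per distinct value, most frequent first
table = [(value, n, round(100 * n / total, 1)) for value, n in counts.most_common()]
for value, n, pct in table:
    print(f"{value}\t{n}\t{pct}%")
```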

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

역명 (station name)
Text

Distinct: 108
Distinct (%): 94.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Length

Max length: 16
Median length: 2
Mean length: 4.5
Min length: 2

Characters and Unicode

Total characters: 513
Distinct characters: 169
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
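
As a small illustration of those character properties, Python's standard `unicodedata` module returns the general category of each code point. The sketch below tallies categories for the five sample 역명 values shown in this report; the codes `Lo`, `Ps`, and `Pe` correspond to Other Letter, Open Punctuation, and Close Punctuation. It is not a claim about how the report itself computes its tables.

```python
import unicodedata
from collections import Counter

# The five sample 역명 values shown in this report
samples = ["괴정", "교대", "구서", "남산(부산외국대학교)", "남포"]

# Count the Unicode general category of every character across the samples
category_counts = Counter(
    unicodedata.category(ch) for name in samples for ch in name
)
print(category_counts)
```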

Unique

Unique: 102
Unique (%): 89.5%
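
Distinct and Unique differ here: reading Distinct as the number of different values and Unique as the number of values that occur exactly once, 108 distinct and 102 unique over 114 rows imply six names that each appear twice. That reading is an assumption, though it is consistent with the figures. The sketch below checks the arithmetic on a synthetic stand-in column with placeholder labels, not the real station names.

```python
from collections import Counter

# Synthetic stand-in for the 역명 column: 102 labels occurring once and
# 6 labels occurring twice (placeholders, not real station names).
names = [f"once_{i}" for i in range(102)] + [f"twice_{i}" for i in range(6)] * 2

counts = Counter(names)
n_rows = len(names)
n_distinct = len(counts)                                # different values
n_unique = sum(1 for c in counts.values() if c == 1)    # values seen exactly once

print(n_rows, n_distinct, n_unique)
print(f"{n_distinct / n_rows:.1%}", f"{n_unique / n_rows:.1%}")
```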

Sample

1st row: 괴정
2nd row: 교대
3rd row: 구서
4th row: 남산(부산외국대학교)
5th row: 남포

Value | Count | Frequency (%)
덕천(부산과기대 | 2 | 1.8%
미남 | 2 | 1.8%
수영 | 2 | 1.8%
연산 | 2 | 1.8%
동래 | 2 | 1.8%
서면 | 2 | 1.8%
지게골 | 1 | 0.9%
전포 | 1 | 0.9%
주례 | 1 | 0.9%
중동 | 1 | 0.9%
Other values (98) | 98 | 86.0%

Most occurring characters

ValueCountFrequency (%)
( 30
 
5.8%
) 30
 
5.8%
26
 
5.1%
25
 
4.9%
17
 
3.3%
13
 
2.5%
10
 
1.9%
10
 
1.9%
9
 
1.8%
9
 
1.8%
Other values (159) 334
65.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 438 | 85.4%
Open Punctuation | 30 | 5.8%
Close Punctuation | 30 | 5.8%
Uppercase Letter | 8 | 1.6%
Other Punctuation | 7 | 1.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
26
 
5.9%
25
 
5.7%
17
 
3.9%
13
 
3.0%
10
 
2.3%
10
 
2.3%
9
 
2.1%
9
 
2.1%
9
 
2.1%
8
 
1.8%
Other values (149) 302
68.9%
Uppercase Letter
ValueCountFrequency (%)
B 2
25.0%
C 1
12.5%
O 1
12.5%
X 1
12.5%
E 1
12.5%
S 1
12.5%
K 1
12.5%
Open Punctuation
ValueCountFrequency (%)
( 30
100.0%
Close Punctuation
ValueCountFrequency (%)
) 30
100.0%
Other Punctuation
ValueCountFrequency (%)
· 7
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 438 | 85.4%
Common | 67 | 13.1%
Latin | 8 | 1.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
26
 
5.9%
25
 
5.7%
17
 
3.9%
13
 
3.0%
10
 
2.3%
10
 
2.3%
9
 
2.1%
9
 
2.1%
9
 
2.1%
8
 
1.8%
Other values (149) 302
68.9%
Latin
ValueCountFrequency (%)
B 2
25.0%
C 1
12.5%
O 1
12.5%
X 1
12.5%
E 1
12.5%
S 1
12.5%
K 1
12.5%
Common
ValueCountFrequency (%)
( 30
44.8%
) 30
44.8%
· 7
 
10.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 438 | 85.4%
ASCII | 68 | 13.3%
None | 7 | 1.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
( 30
44.1%
) 30
44.1%
B 2
 
2.9%
C 1
 
1.5%
O 1
 
1.5%
X 1
 
1.5%
E 1
 
1.5%
S 1
 
1.5%
K 1
 
1.5%
Hangul
ValueCountFrequency (%)
26
 
5.9%
25
 
5.7%
17
 
3.9%
13
 
3.0%
10
 
2.3%
10
 
2.3%
9
 
2.1%
9
 
2.1%
9
 
2.1%
8
 
1.8%
Other values (149) 302
68.9%
None
ValueCountFrequency (%)
· 7
100.0%

지상지하 (above ground or underground)
Categorical

Distinct: 2
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB
지하 (underground): 90
지상 (above ground): 24

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지하
2nd row: 지하
3rd row: 지상
4th row: 지하
5th row: 지하

Common Values

Value | Count | Frequency (%)
지하 | 90 | 78.9%
지상 | 24 | 21.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

역층 (station floor)
Categorical

IMBALANCE

Distinct: 4
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB
1: 92
2: 18
4: 3
3: 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 0.9%

Sample

1st row: 1
2nd row: 1
3rd row: 2
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 92 | 80.7%
2 | 18 | 15.8%
4 | 3 | 2.6%
3 | 1 | 0.9%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

상세위치 (detailed location)
Text

Distinct: 89
Distinct (%): 78.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Length

Max length: 43
Median length: 28
Mean length: 18.289474
Min length: 10

Characters and Unicode

Total characters: 2085
Distinct characters: 104
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 79
Unique (%): 69.3%

Sample

1st row: (B1) 만남의 장소/ 전기실 사이
2nd row: (B1) 연산역 방향 승강장 표 내는 곳 옆
3rd row: (2F) 대합실 중앙
4th row: (B1) 대합실 중앙/ 화장실 맞은편
5th row: (B1) 대합실 2번 출입구 방향
ValueCountFrequency (%)
대합실 79
 
13.9%
b1 78
 
13.7%
출입구 64
 
11.3%
방향 58
 
10.2%
19
 
3.3%
1번 14
 
2.5%
14
 
2.5%
2번 12
 
2.1%
1f 11
 
1.9%
4번 10
 
1.8%
Other values (89) 209
36.8%

Most occurring characters

ValueCountFrequency (%)
459
22.0%
1 128
 
6.1%
( 114
 
5.5%
) 114
 
5.5%
89
 
4.3%
B 89
 
4.3%
88
 
4.2%
86
 
4.1%
80
 
3.8%
73
 
3.5%
Other values (94) 765
36.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 995 | 47.7%
Space Separator | 459 | 22.0%
Decimal Number | 238 | 11.4%
Uppercase Letter | 126 | 6.0%
Open Punctuation | 114 | 5.5%
Close Punctuation | 114 | 5.5%
Other Punctuation | 35 | 1.7%
Math Symbol | 3 | 0.1%
Dash Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
89
 
8.9%
88
 
8.8%
86
 
8.6%
80
 
8.0%
73
 
7.3%
70
 
7.0%
68
 
6.8%
61
 
6.1%
61
 
6.1%
22
 
2.2%
Other values (72) 297
29.8%
Decimal Number
ValueCountFrequency (%)
1 128
53.8%
2 42
 
17.6%
3 22
 
9.2%
4 15
 
6.3%
0 10
 
4.2%
5 8
 
3.4%
6 7
 
2.9%
7 4
 
1.7%
8 2
 
0.8%
Uppercase Letter
ValueCountFrequency (%)
B 89
70.6%
F 24
 
19.0%
E 5
 
4.0%
L 4
 
3.2%
S 2
 
1.6%
G 2
 
1.6%
Other Punctuation
ValueCountFrequency (%)
/ 34
97.1%
. 1
 
2.9%
Space Separator
ValueCountFrequency (%)
459
100.0%
Open Punctuation
ValueCountFrequency (%)
( 114
100.0%
Close Punctuation
ValueCountFrequency (%)
) 114
100.0%
Math Symbol
ValueCountFrequency (%)
~ 3
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 995
47.7%
Common 964
46.2%
Latin 126
 
6.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
89
 
8.9%
88
 
8.8%
86
 
8.6%
80
 
8.0%
73
 
7.3%
70
 
7.0%
68
 
6.8%
61
 
6.1%
61
 
6.1%
22
 
2.2%
Other values (72) 297
29.8%
Common
ValueCountFrequency (%)
459
47.6%
1 128
 
13.3%
( 114
 
11.8%
) 114
 
11.8%
2 42
 
4.4%
/ 34
 
3.5%
3 22
 
2.3%
4 15
 
1.6%
0 10
 
1.0%
5 8
 
0.8%
Other values (6) 18
 
1.9%
Latin
ValueCountFrequency (%)
B 89
70.6%
F 24
 
19.0%
E 5
 
4.0%
L 4
 
3.2%
S 2
 
1.6%
G 2
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1090
52.3%
Hangul 995
47.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
459
42.1%
1 128
 
11.7%
( 114
 
10.5%
) 114
 
10.5%
B 89
 
8.2%
2 42
 
3.9%
/ 34
 
3.1%
F 24
 
2.2%
3 22
 
2.0%
4 15
 
1.4%
Other values (12) 49
 
4.5%
Hangul
ValueCountFrequency (%)
89
 
8.9%
88
 
8.8%
86
 
8.6%
80
 
8.0%
73
 
7.3%
70
 
7.0%
68
 
6.8%
61
 
6.1%
61
 
6.1%
22
 
2.2%
Other values (72) 297
29.8%

전화번호 (telephone number)
Text

UNIQUE

Distinct: 114
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 1368
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 114
Unique (%): 100.0%

Sample

1st row: 051-678-6105
2nd row: 051-678-6124
3rd row: 051-678-6130
4th row: 051-678-6132
5th row: 051-678-6111

Value | Count | Frequency (%)
051-678-6105 | 1 | 0.9%
051-678-6306 | 1 | 0.9%
051-678-6235 | 1 | 0.9%
051-678-6239 | 1 | 0.9%
051-678-6203 | 1 | 0.9%
051-678-6215 | 1 | 0.9%
051-678-6240 | 1 | 0.9%
051-678-6202 | 1 | 0.9%
051-678-6225 | 1 | 0.9%
051-678-6218 | 1 | 0.9%
Other values (104) | 104 | 91.2%

Most occurring characters

ValueCountFrequency (%)
6 239
17.5%
- 228
16.7%
1 200
14.6%
0 161
11.8%
5 125
9.1%
7 125
9.1%
8 124
9.1%
2 76
 
5.6%
3 45
 
3.3%
4 30
 
2.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1140 | 83.3%
Dash Punctuation | 228 | 16.7%
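
These totals are consistent with a fixed layout: 1140 digits and 228 dashes over 114 values give 10 digits and 2 dashes per value, matching the constant length of 12 and the 3-3-4 digit grouping seen in the sample rows. A minimal sketch of a format check on the five sample values follows; `PHONE_PATTERN` is a hypothetical pattern inferred from those samples, not something the report defines.

```python
import re

# Hypothetical pattern inferred from the samples: 3 digits, 3 digits, 4 digits
PHONE_PATTERN = re.compile(r"\d{3}-\d{3}-\d{4}")

# The five sample 전화번호 values shown in this report
samples = ["051-678-6105", "051-678-6124", "051-678-6130", "051-678-6132", "051-678-6111"]

all_match = all(PHONE_PATTERN.fullmatch(s) is not None for s in samples)
digits = sum(ch.isdigit() for s in samples for ch in s)
dashes = sum(ch == "-" for s in samples for ch in s)
print(all_match, digits, dashes)
```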

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 239
21.0%
1 200
17.5%
0 161
14.1%
5 125
11.0%
7 125
11.0%
8 124
10.9%
2 76
 
6.7%
3 45
 
3.9%
4 30
 
2.6%
9 15
 
1.3%
Dash Punctuation
ValueCountFrequency (%)
- 228
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1368
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
6 239
17.5%
- 228
16.7%
1 200
14.6%
0 161
11.8%
5 125
9.1%
7 125
9.1%
8 124
9.1%
2 76
 
5.6%
3 45
 
3.3%
4 30
 
2.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1368
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 239
17.5%
- 228
16.7%
1 200
14.6%
0 161
11.8%
5 125
9.1%
7 125
9.1%
8 124
9.1%
2 76
 
5.6%
3 45
 
3.3%
4 30
 
2.2%

Correlations

[Figure: correlation heatmap]

Variable | 선명 | 지상지하 | 역층 | 상세위치
선명 | 1.000 | 0.200 | 0.265 | 0.393
지상지하 | 0.200 | 1.000 | 0.638 | 1.000
역층 | 0.265 | 0.638 | 1.000 | 1.000
상세위치 | 0.393 | 1.000 | 1.000 | 1.000

[Figure: correlation heatmap]

Variable | 선명 | 역층 | 지상지하
선명 | 1.000 | 0.105 | 0.131
역층 | 0.105 | 1.000 | 0.441
지상지하 | 0.131 | 0.441 | 1.000

[Figure: correlation heatmap]

Variable | 선명 | 지상지하 | 역층
선명 | 1.000 | 0.131 | 0.105
지상지하 | 0.131 | 1.000 | 0.441
역층 | 0.105 | 0.441 | 1.000
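
The extracted text does not say which association measure produced each matrix. For pairs of categorical columns, Cramér's V is one common choice; the sketch below is a plain implementation without bias correction, run on hypothetical contingency tables rather than on this dataset, so it illustrates the measure and does not reproduce the values above. `cramers_v` is an illustrative helper name.

```python
import math


def cramers_v(table):
    """Cramér's V (no bias correction) for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 and n > 0 else 0.0


# Hypothetical 2x2 tables, not taken from this dataset
perfect = cramers_v([[10, 0], [0, 10]])     # categories determine each other
independent = cramers_v([[5, 5], [5, 5]])   # no association
print(perfect, independent)
```

Values range from 0 (no association) to 1 (one column fully determines the other).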

Missing values

[Figure: nullity by column] A simple visualization of nullity by column.
[Figure: nullity matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 철도운영기관명 | 선명 | 역명 | 지상지하 | 역층 | 상세위치 | 전화번호
0 | 부산교통공사 | 1호선 | 괴정 | 지하 | 1 | (B1) 만남의 장소/ 전기실 사이 | 051-678-6105
1 | 부산교통공사 | 1호선 | 교대 | 지하 | 1 | (B1) 연산역 방향 승강장 표 내는 곳 옆 | 051-678-6124
2 | 부산교통공사 | 1호선 | 구서 | 지상 | 2 | (2F) 대합실 중앙 | 051-678-6130
3 | 부산교통공사 | 1호선 | 남산(부산외국대학교) | 지하 | 1 | (B1) 대합실 중앙/ 화장실 맞은편 | 051-678-6132
4 | 부산교통공사 | 1호선 | 남포 | 지하 | 1 | (B1) 대합실 2번 출입구 방향 | 051-678-6111
5 | 부산교통공사 | 1호선 | 낫개 | 지하 | 1 | (B1) 대합실 6번 출입구 방향 | 051-678-6197
6 | 부산교통공사 | 1호선 | 노포(종합버스터미널) | 지상 | 2 | (2F) 대합실 1번 출입구 맞은편 | 051-678-6134
7 | 부산교통공사 | 1호선 | 다대포항 | 지하 | 1 | (B1) 대합실 1/ 2번 출입구 방향 | 051-678-6196
8 | 부산교통공사 | 1호선 | 다대포해수욕장 | 지하 | 1 | (B1) 대합실 2/ 4번 출입구 방향 | 051-678-6195
9 | 부산교통공사 | 1호선 | 당리(사하구청) | 지하 | 1 | (B1) 대합실 중간 | 051-678-6103

Last rows

Index | 철도운영기관명 | 선명 | 역명 | 지상지하 | 역층 | 상세위치 | 전화번호
104 | 부산교통공사 | 4호선 | 명장 | 지하 | 1 | (B1) 대합실 3번 출입구 방향 | 051-678-6406
105 | 부산교통공사 | 4호선 | 미남 | 지하 | 1 | (B1) 1번/5번 출구 중간지점 | 051-678-6401
106 | 부산교통공사 | 4호선 | 반여농산물시장 | 지상 | 1 | (1F) 1번 출입구 표 내는 곳 앞 | 051-678-6409
107 | 부산교통공사 | 4호선 | 서동 | 지하 | 1 | (B1) 대합실 2번 출입구 방향 | 051-678-6407
108 | 부산교통공사 | 4호선 | 석대 | 지상 | 1 | (1F) 2번 출입구 앞 | 051-678-6410
109 | 부산교통공사 | 4호선 | 수안 | 지하 | 1 | (B1) 대합실 7번 출입구 방향 | 051-678-6403
110 | 부산교통공사 | 4호선 | 안평(고촌주택단지) | 지상 | 2 | (2F) 표내는 곳 옆 / 1/3번 출입구 방향 | 051-678-6414
111 | 부산교통공사 | 4호선 | 영산대(아랫반송) | 지상 | 2 | (2F) 대합실 2번 출입구 방향 | 051-678-6411
112 | 부산교통공사 | 4호선 | 윗반송 | 지상 | 2 | (2F) 대합실 1/3번 출입구 방향 | 051-678-6412
113 | 부산교통공사 | 4호선 | 충렬사(안락) | 지하 | 1 | (B1) 대합실 3번 출입구 방향 | 051-678-6405