Overview

Dataset statistics

Number of variables: 7
Number of observations: 446
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 5
Duplicate rows (%): 1.1%
Total size in memory: 25.4 KiB
Average record size in memory: 58.3 B
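
The percentage and per-record figures above follow directly from the raw counts. A minimal sketch in Python that uses only the numbers reported in this section (the variable names are illustrative and not part of the profiler's API):

```python
# Values copied from the Dataset statistics table.
n_observations = 446
n_duplicate_rows = 5
total_size_kib = 25.4

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = n_duplicate_rows / n_observations * 100

# Average record size: total memory (KiB converted to bytes) divided by row count.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```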

Variable types

Categorical: 5
Text: 2

Dataset

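Description: Data on the elevators on lines operated by Busan Transportation Corporation (부산교통공사), covering railway operator name (철도운영기관명), line name (선명), station name (역명), exit number (출입구번호), detailed location (상세위치), passenger capacity (정원인원), and rated load (정원중량).
Author: Korea National Railway (국가철도공단)
URL: https://www.data.go.kr/data/15041375/fileData.do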

Alerts

철도운영기관명 has constant value "부산교통공사" [Constant]
Dataset has 5 (1.1%) duplicate rows [Duplicates]
정원_중량(kg) is highly overall correlated with 정원_인원 [High correlation]
정원_인원 is highly overall correlated with 정원_중량(kg) [High correlation]
정원_인원 is highly imbalanced (66.2%) [Imbalance]
정원_중량(kg) is highly imbalanced (66.2%) [Imbalance]
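
The 66.2% imbalance figure is consistent with a score of one minus the normalized Shannon entropy of the class frequencies. The sketch below reproduces the number from the counts reported for 정원_인원 (391, 35, 18, 2). The function name `imbalance_score` is illustrative, and the formula is an assumption inferred from the reported value rather than taken from the profiler's source.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/H_max, where H is the Shannon entropy of the class shares.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    if len(shares) < 2:
        return 0.0  # convention chosen for this sketch: one class has no balance to measure
    entropy = -sum(p * math.log2(p) for p in shares)
    max_entropy = math.log2(len(shares))
    return 1 - entropy / max_entropy

# Class counts for 정원_인원 (values 13, 15, 10, 8) as listed in this report.
score = imbalance_score([391, 35, 18, 2])
print(f"{score:.1%}")
```

Because 정원_중량(kg) has the same four class counts, the same score applies to it.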

Reproduction

Analysis started: 2023-12-12 04:53:20.317381
Analysis finished: 2023-12-12 04:53:20.948078
Duration: 0.63 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

철도운영기관명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.6 KiB
부산교통공사: 446

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%
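
Distinct counts the different values in a column, while Unique counts the values that occur exactly once. That is why this constant column reports one distinct value but zero unique values. A small illustration with `collections.Counter`; the helper name `distinct_and_unique` is hypothetical:

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of distinct values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique

# A constant column like 철도운영기관명: 446 copies of the same value.
constant_column = ["부산교통공사"] * 446
print(distinct_and_unique(constant_column))  # (1, 0)
```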

Sample

1st row: 부산교통공사
2nd row: 부산교통공사
3rd row: 부산교통공사
4th row: 부산교통공사
5th row: 부산교통공사

Common Values

Value | Count | Frequency (%)
부산교통공사 | 446 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values of 철도운영기관명]

선명
Categorical

Distinct: 4
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 3.6 KiB
2호선: 172
1호선: 159
3호선: 68
4호선: 47

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1호선
2nd row: 1호선
3rd row: 1호선
4th row: 1호선
5th row: 1호선

Common Values

Value | Count | Frequency (%)
2호선 | 172 | 38.6%
1호선 | 159 | 35.7%
3호선 | 68 | 15.2%
4호선 | 47 | 10.5%
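
Each Frequency (%) entry is the value's count divided by the 446 observations. A sketch that rebuilds the percentages for 선명 from the counts listed above (the dictionary is typed in by hand here; with the raw data one would tally the column instead):

```python
# Counts for 선명 as reported in the Common Values table.
line_counts = {"2호선": 172, "1호선": 159, "3호선": 68, "4호선": 47}

total = sum(line_counts.values())  # 446 observations

# Percentage share of each line, most frequent first.
frequencies = {
    line: round(count / total * 100, 1)
    for line, count in sorted(line_counts.items(), key=lambda kv: kv[1], reverse=True)
}

for line, pct in frequencies.items():
    print(f"{line}\t{line_counts[line]}\t{pct}%")
```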

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values of 선명]

역명
Text

Distinct: 111
Distinct (%): 24.9%
Missing: 0
Missing (%): 0.0%
Memory size: 3.6 KiB

Length

Max length: 8
Median length: 2
Mean length: 2.5381166
Min length: 2

Characters and Unicode

Total characters: 1132
Distinct characters: 133
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 0.7%
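
The mean length is the total character count divided by the number of rows: 1132 characters over 446 station names gives the 2.5381166 reported for this variable. The sketch below checks that arithmetic and applies the same calculation to a few station names that appear in this dataset; `mean_length` is an illustrative helper.

```python
def mean_length(values):
    """Average number of characters (Unicode code points) per string."""
    return sum(len(v) for v in values) / len(values)

# Check against the reported totals for 역명.
total_characters = 1132
n_rows = 446
print(round(total_characters / n_rows, 7))  # 2.5381166

# The same statistic on a handful of station names from this dataset.
print(mean_length(["노포", "범어사", "다대포해수욕장"]))  # 4.0
```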

Sample

1st row: 노포
2nd row: 노포
3rd row: 범어사
4th row: 범어사
5th row: 범어사

Value | Count | Frequency (%)
동매 | 10 | 2.2%
벡스코 | 7 | 1.6%
만덕 | 7 | 1.6%
덕천 | 7 | 1.6%
동래 | 6 | 1.3%
범내골 | 6 | 1.3%
거제 | 6 | 1.3%
다대포해수욕장 | 6 | 1.3%
배산 | 6 | 1.3%
중동 | 6 | 1.3%
Other values (101) | 379 | 85.0%

Most occurring characters

ValueCountFrequency (%)
60
 
5.3%
58
 
5.1%
46
 
4.1%
33
 
2.9%
31
 
2.7%
31
 
2.7%
26
 
2.3%
24
 
2.1%
23
 
2.0%
21
 
1.9%
Other values (123) 779
68.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1113 | 98.3%
Decimal Number | 19 | 1.7%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
60
 
5.4%
58
 
5.2%
46
 
4.1%
33
 
3.0%
31
 
2.8%
31
 
2.8%
26
 
2.3%
24
 
2.2%
23
 
2.1%
21
 
1.9%
Other values (120) 760
68.3%
Decimal Number
ValueCountFrequency (%)
2 9
47.4%
1 7
36.8%
3 3
 
15.8%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1113 | 98.3%
Common | 19 | 1.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
60
 
5.4%
58
 
5.2%
46
 
4.1%
33
 
3.0%
31
 
2.8%
31
 
2.8%
26
 
2.3%
24
 
2.2%
23
 
2.1%
21
 
1.9%
Other values (120) 760
68.3%
Common
ValueCountFrequency (%)
2 9
47.4%
1 7
36.8%
3 3
 
15.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1113 | 98.3%
ASCII | 19 | 1.7%

Most frequent character per block

Hangul
ValueCountFrequency (%)
60
 
5.4%
58
 
5.2%
46
 
4.1%
33
 
3.0%
31
 
2.8%
31
 
2.8%
26
 
2.3%
24
 
2.2%
23
 
2.1%
21
 
1.9%
Other values (120) 760
68.3%
ASCII
ValueCountFrequency (%)
2 9
47.4%
1 7
36.8%
3 3
 
15.8%

출입구번호
Categorical

Distinct: 34
Distinct (%): 7.6%
Missing: 0
Missing (%): 0.0%
Memory size: 3.6 KiB
<NA>: 170
3번: 47
4번: 46
1번: 39
2번: 31
Other values (29): 113

Length

Max length: 7
Median length: 6
Mean length: 3.0358744
Min length: 1

Unique

Unique: 10
Unique (%): 2.2%

Sample

1st row: 2번
2nd row: 2번
3rd row: 3번
4th row: 4번
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 170 | 38.1%
3번 | 47 | 10.5%
4번 | 46 | 10.3%
1번 | 39 | 8.7%
2번 | 31 | 7.0%
5번 | 23 | 5.2%
6번 | 17 | 3.8%
2번4번 | 8 | 1.8%
7번 | 7 | 1.6%
8번 | 6 | 1.3%
Other values (24) | 52 | 11.7%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
na | 170 | 36.6%
4번 | 52 | 11.2%
3번 | 51 | 11.0%
1번 | 44 | 9.5%
2번 | 37 | 8.0%
5번 | 24 | 5.2%
6번 | 21 | 4.5%
8번 | 10 | 2.2%
2번4번 | 8 | 1.7%
7번 | 8 | 1.7%
Other values (15) | 39 | 8.4%

상세위치
Text

Distinct: 432
Distinct (%): 96.9%
Missing: 0
Missing (%): 0.0%
Memory size: 3.6 KiB

Length

Max length: 84
Median length: 65
Mean length: 37.352018
Min length: 4

Characters and Unicode

Total characters: 16659
Distinct characters: 267
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 420
Unique (%): 94.2%
Sample

1st row: (2F) 2번 출입구 앞 (1F) 밤어사역 방향 승강장 6-1 출입문 앞
2nd row: (2F) 2번 출입구 앞 (1F) 노포행 승강장 6-1 출입문 앞
3rd row: (1F) 3번출입구 (B2) 1번/3번 출입구 방향
4th row: (1F) 4번출입구(B2) 2번/4번 출입구 방향
5th row: (B2) 남산역 방향 표내는 곳 내 1번/3번 출입구 방향(B3) 남산역 방향 승강장 8-1 출입문 앞
ValueCountFrequency (%)
출입구 420
 
9.9%
방향 321
 
7.6%
승강장 191
 
4.5%
b1 182
 
4.3%
177
 
4.2%
1f 176
 
4.2%
출입문 165
 
3.9%
154
 
3.6%
92
 
2.2%
사이 77
 
1.8%
Other values (507) 2278
53.8%

Most occurring characters

ValueCountFrequency (%)
3841
23.1%
1 863
 
5.2%
( 843
 
5.1%
) 843
 
5.1%
655
 
3.9%
653
 
3.9%
615
 
3.7%
B 544
 
3.3%
506
 
3.0%
406
 
2.4%
Other values (257) 6890
41.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 7862 | 47.2%
Space Separator | 3841 | 23.1%
Decimal Number | 2060 | 12.4%
Uppercase Letter | 847 | 5.1%
Open Punctuation | 843 | 5.1%
Close Punctuation | 843 | 5.1%
Dash Punctuation | 191 | 1.1%
Other Punctuation | 154 | 0.9%
Math Symbol | 13 | 0.1%
Lowercase Letter | 5 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
655
 
8.3%
653
 
8.3%
615
 
7.8%
506
 
6.4%
406
 
5.2%
387
 
4.9%
268
 
3.4%
250
 
3.2%
231
 
2.9%
209
 
2.7%
Other values (228) 3682
46.8%
Decimal Number
Value | Count | Frequency (%)
1 | 863 | 41.9%
2 | 400 | 19.4%
3 | 278 | 13.5%
4 | 202 | 9.8%
5 | 100 | 4.9%
6 | 81 | 3.9%
0 | 51 | 2.5%
7 | 37 | 1.8%
8 | 35 | 1.7%
9 | 13 | 0.6%
Uppercase Letter
Value | Count | Frequency (%)
B | 544 | 64.2%
F | 286 | 33.8%
S | 4 | 0.5%
E | 4 | 0.5%
G | 3 | 0.4%
X | 3 | 0.4%
L | 1 | 0.1%
M | 1 | 0.1%
A | 1 | 0.1%
Other Punctuation
Value | Count | Frequency (%)
/ | 144 | 93.5%
. | 10 | 6.5%
Math Symbol
Value | Count | Frequency (%)
> | 11 | 84.6%
~ | 2 | 15.4%
Lowercase Letter
Value | Count | Frequency (%)
m | 4 | 80.0%
x | 1 | 20.0%
Space Separator
ValueCountFrequency (%)
3841
100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 843 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 843 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 191 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 7945 | 47.7%
Hangul | 7862 | 47.2%
Latin | 852 | 5.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
655
 
8.3%
653
 
8.3%
615
 
7.8%
506
 
6.4%
406
 
5.2%
387
 
4.9%
268
 
3.4%
250
 
3.2%
231
 
2.9%
209
 
2.7%
Other values (228) 3682
46.8%
Common
ValueCountFrequency (%)
3841
48.3%
1 863
 
10.9%
( 843
 
10.6%
) 843
 
10.6%
2 400
 
5.0%
3 278
 
3.5%
4 202
 
2.5%
- 191
 
2.4%
/ 144
 
1.8%
5 100
 
1.3%
Other values (8) 240
 
3.0%
Latin
Value | Count | Frequency (%)
B | 544 | 63.8%
F | 286 | 33.6%
S | 4 | 0.5%
E | 4 | 0.5%
m | 4 | 0.5%
G | 3 | 0.4%
X | 3 | 0.4%
L | 1 | 0.1%
M | 1 | 0.1%
x | 1 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 8797 | 52.8%
Hangul | 7862 | 47.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3841
43.7%
1 863
 
9.8%
( 843
 
9.6%
) 843
 
9.6%
B 544
 
6.2%
2 400
 
4.5%
F 286
 
3.3%
3 278
 
3.2%
4 202
 
2.3%
- 191
 
2.2%
Other values (19) 506
 
5.8%
Hangul
ValueCountFrequency (%)
655
 
8.3%
653
 
8.3%
615
 
7.8%
506
 
6.4%
406
 
5.2%
387
 
4.9%
268
 
3.4%
250
 
3.2%
231
 
2.9%
209
 
2.7%
Other values (228) 3682
46.8%

정원_인원
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 3.6 KiB
13: 391
15: 35
10: 18
8: 2

Length

Max length: 2
Median length: 2
Mean length: 1.9955157
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 10
2nd row: 10
3rd row: 13
4th row: 13
5th row: 13

Common Values

Value | Count | Frequency (%)
13 | 391 | 87.7%
15 | 35 | 7.8%
10 | 18 | 4.0%
8 | 2 | 0.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values of 정원_인원]

정원_중량(kg)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 3.6 KiB
1000: 391
1150: 35
750: 18
630: 2

Length

Max length: 4
Median length: 4
Mean length: 3.955157
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 750
2nd row: 750
3rd row: 1000
4th row: 1000
5th row: 1000

Common Values

Value | Count | Frequency (%)
1000 | 391 | 87.7%
1150 | 35 | 7.8%
750 | 18 | 4.0%
630 | 2 | 0.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values of 정원_중량(kg)]

Correlations

Variable | 선명 | 출입구번호 | 정원_인원 | 정원_중량(kg)
선명 | 1.000 | 0.267 | 0.357 | 0.357
출입구번호 | 0.267 | 1.000 | 0.464 | 0.464
정원_인원 | 0.357 | 0.464 | 1.000 | 1.000
정원_중량(kg) | 0.357 | 0.464 | 1.000 | 1.000

Variable | 정원_중량(kg) | 출입구번호 | 정원_인원 | 선명
정원_중량(kg) | 1.000 | 0.230 | 1.000 | 0.145
출입구번호 | 0.230 | 1.000 | 0.230 | 0.131
정원_인원 | 1.000 | 0.230 | 1.000 | 0.145
선명 | 0.145 | 0.131 | 0.145 | 1.000

Variable | 선명 | 출입구번호 | 정원_인원 | 정원_중량(kg)
선명 | 1.000 | 0.131 | 0.145 | 0.145
출입구번호 | 0.131 | 1.000 | 0.230 | 0.230
정원_인원 | 0.145 | 0.230 | 1.000 | 1.000
정원_중량(kg) | 0.145 | 0.230 | 1.000 | 1.000
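
A coefficient of 1.000 between 정원_인원 and 정원_중량(kg) suggests that each capacity value maps to exactly one rated load. The sketch below computes a plain, uncorrected Cramér's V from a contingency table to illustrate that case. The pairing of values is an assumption inferred from the matching counts (391, 35, 18, 2) and the sample rows, and the profiler may use a different or bias-corrected statistic.

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))

# Rows: 정원_인원 = 13, 15, 10, 8; columns: 정원_중량(kg) = 1000, 1150, 750, 630.
# Assumed one-to-one pairing, so all counts sit on the diagonal.
table = [
    [391, 0, 0, 0],
    [0, 35, 0, 0],
    [0, 0, 18, 0],
    [0, 0, 0, 2],
]
print(round(cramers_v(table), 3))  # 1.0
```

When one column is fully determined by another like this, one of the two can usually be dropped from downstream analysis without losing information.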

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 철도운영기관명 | 선명 | 역명 | 출입구번호 | 상세위치 | 정원_인원 | 정원_중량(kg)
0 | 부산교통공사 | 1호선 | 노포 | 2번 | (2F) 2번 출입구 앞 (1F) 밤어사역 방향 승강장 6-1 출입문 앞 | 10 | 750
1 | 부산교통공사 | 1호선 | 노포 | 2번 | (2F) 2번 출입구 앞 (1F) 노포행 승강장 6-1 출입문 앞 | 10 | 750
2 | 부산교통공사 | 1호선 | 범어사 | 3번 | (1F) 3번출입구 (B2) 1번/3번 출입구 방향 | 13 | 1000
3 | 부산교통공사 | 1호선 | 범어사 | 4번 | (1F) 4번출입구(B2) 2번/4번 출입구 방향 | 13 | 1000
4 | 부산교통공사 | 1호선 | 범어사 | <NA> | (B2) 남산역 방향 표내는 곳 내 1번/3번 출입구 방향(B3) 남산역 방향 승강장 8-1 출입문 앞 | 13 | 1000
5 | 부산교통공사 | 1호선 | 범어사 | <NA> | (B2) 노포역 방향 표내는 곳 내 2번/4번 출입구방향(B3) 노포역 방향 승강장 8-1출입문 앞 | 13 | 1000
6 | 부산교통공사 | 1호선 | 남산 | 3번 | (1F) 3번 출입구 앞(B1) 10번 표내는 곳 옆 | 13 | 1000
7 | 부산교통공사 | 1호선 | 남산 | 4번 | (1F) 4번 출입구 앞(B1) 20번 표내는 곳 옆 | 13 | 1000
8 | 부산교통공사 | 1호선 | 남산 | <NA> | (B1) 15번 표내는 곳 앞(B2) 두실역 방향 승강장 5-3 출입문 앞 | 13 | 1000
9 | 부산교통공사 | 1호선 | 남산 | <NA> | (B1) 25번 표내는 곳 앞 2번/4번 출입구 방향(B2) 범어사역 방향 승강장 8-2 출입문 앞 | 13 | 1000

Last rows

Index | 철도운영기관명 | 선명 | 역명 | 출입구번호 | 상세위치 | 정원_인원 | 정원_중량(kg)
436 | 부산교통공사 | 4호선 | 영산대 | <NA> | (2F) E/S 앞 | 13 | 1000
437 | 부산교통공사 | 4호선 | 윗반송 | 3번 | (1F) 1번/3번 출입구 사이 엘리베이터(2F) 1번/3번 출입구 사이 엘리베이터 대합실 옆 | 13 | 1000
438 | 부산교통공사 | 4호선 | 윗반송 | 4번 | (1F) 2번/4번 출입구 사이 엘리베이터(2F) 2번/4번 출입구 사이 엘리베이터 대합실 옆 | 13 | 1000
439 | 부산교통공사 | 4호선 | 윗반송 | <NA> | (2F) 표내는곳 옆(3F) 승강장 1-2 출입문 앞 | 13 | 1000
440 | 부산교통공사 | 4호선 | 고촌 | 1번 | (1F) 1번/3번 출입구 사이 엘리베이터(2F) 1번/3번 출입구 사이 엘리베이터 대합실 옆 | 13 | 1000
441 | 부산교통공사 | 4호선 | 고촌 | 2번 | (1F) 2번/4번 출입구 사이 엘리베이터(2F) 2번/4번 출입구 사이 엘리베이터 대합실 옆 | 13 | 1000
442 | 부산교통공사 | 4호선 | 고촌 | <NA> | (2F) 표내는곳 옆(3F) 승강장 1-2 출입문 앞 | 13 | 1000
443 | 부산교통공사 | 4호선 | 안평 | 1번3번 | (1) 13번 출입구 사이 (2) 1번 출입구 계단 앞 | 13 | 1000
444 | 부산교통공사 | 4호선 | 안평 | 2번4번 | (1) 24번 출입구 사이 (2) 2번 출입구 계단 앞 | 13 | 1000
445 | 부산교통공사 | 4호선 | 안평 | <NA> | (2) 표내는 곳 안쪽 에스컬레이터 앞(3) 승강장 미남방향 1-1 옆 | 13 | 1000

Duplicate rows

Most frequently occurring

Index | 철도운영기관명 | 선명 | 역명 | 출입구번호 | 상세위치 | 정원_인원 | 정원_중량(kg) | # duplicates
2 | 부산교통공사 | 1호선 | 동매 | 6번 | (1F) 6번 출입구 앞(B1) 대합실 만남의 광장 근처 | 13 | 1000 | 3
0 | 부산교통공사 | 1호선 | 당리 | 3번 | (1F) 3번출입구 엘리베이터 자체가 3번 출입구로 지정(계단X) | 13 | 1000 | 2
1 | 부산교통공사 | 1호선 | 동매 | 1번 | (1F) 1번 출입구 앞(B1) 대합실 고객센터 근처 | 13 | 1000 | 2
3 | 부산교통공사 | 3호선 | 망미 | 3번 | B1 10번대 표내는곳 방향 > B6 수영역 방향 승강장 4-2 출입문 앞 | 15 | 1150 | 2
4 | 부산교통공사 | 3호선 | 망미 | 4번 | B1 20번대 표내는곳 방향 > B6 배산역 방향 승강장 4-2 출입문 앞 | 15 | 1150 | 2
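
The table lists five distinct rows that occur more than once, which appears to be what the Duplicate rows statistic of 5 (1.1%) counts; the number of records involved is larger. A sketch with `collections.Counter` showing the different ways to count, using short stand-in tuples rather than the full records (the extra 노포 row is added only for contrast):

```python
from collections import Counter

# Stand-in rows: (역명, 출입구번호) pairs repeated as in the "# duplicates" column.
rows = (
    [("동매", "6번")] * 3
    + [("당리", "3번")] * 2
    + [("동매", "1번")] * 2
    + [("망미", "3번")] * 2
    + [("망미", "4번")] * 2
    + [("노포", "2번")]  # one non-duplicated row for contrast
)

counts = Counter(rows)
duplicated = {row: n for row, n in counts.items() if n > 1}

distinct_duplicated_rows = len(duplicated)         # rows listed in the table
records_involved = sum(duplicated.values())        # all copies, originals included
extra_copies = records_involved - len(duplicated)  # copies beyond the first

print(distinct_duplicated_rows, records_involved, extra_copies)  # 5 11 6
```

Dropping duplicates from the full dataset would therefore remove 6 records under this reading, not 5.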