Overview

Dataset statistics

Number of variables: 10
Number of observations: 80
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.5 KiB
Average record size in memory: 83.7 B

Variable types

Categorical: 6
Text: 1
Boolean: 3

Dataset

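Description: Platform information for Busan Line 1 (부산1호선), operated by Busan Transportation Corporation (부산교통공사). The data covers railway operator name (철도운영기관명), line name (선명), station name (역명), platform number (승강장번호), up/down direction (상하행구분), above-ground/underground classification (지상구분), station floor (역층), platform connection status (승강장연결 여부), platform screen door presence (스크린도어 유무), and safety footboard presence (안전발판 유무).
Author: Korea National Railway (국가철도공단)
URL: https://www.data.go.kr/data/15041172/fileData.do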

Alerts

철도운영기관명 has constant value "부산교통공사" [Constant]
선명 has constant value "1호선" [Constant]
스크린도어 유무 has constant value "True" [Constant]
안전발판 유무 has constant value "True" [Constant]
승강장번호 is highly overall correlated with 상하행 [High correlation]
상하행 is highly overall correlated with 승강장번호 [High correlation]
역층 is highly imbalanced (66.2%) [Imbalance]
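
The 66.2% imbalance figure for 역층 can be reproduced from the value counts listed in this report (70, 4, 2, 2, 2) if the score is taken to be one minus the Shannon entropy divided by its maximum for five classes. The sketch below assumes that definition; the report itself does not state the formula, and the function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes <= 1:
        # Assumption: a single-class column has no defined maximum entropy; return 0.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)

# Value counts of 역층 as listed in this report: 2 -> 70, 3 -> 4, 5 -> 2, 4 -> 2, 1 -> 2
floor_counts = [70, 4, 2, 2, 2]
score = imbalance_score(floor_counts)
print(f"{score:.1%}")  # 66.2%
```

A perfectly balanced column such as 승강장번호 (40 and 40) scores 0 under the same definition, which is consistent with it not being flagged.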

Reproduction

Analysis started: 2023-12-12 20:01:53.549585
Analysis finished: 2023-12-12 20:01:54.080128
Duration: 0.53 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

철도운영기관명 (railway operator name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 772.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 부산교통공사
2nd row: 부산교통공사
3rd row: 부산교통공사
4th row: 부산교통공사
5th row: 부산교통공사

Common Values

Value | Count | Frequency (%)
부산교통공사 | 80 | 100.0%

[Figure: Histogram of lengths of the category]
[Figure: Common Values (Plot)]

선명 (line name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 772.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1호선
2nd row: 1호선
3rd row: 1호선
4th row: 1호선
5th row: 1호선

Common Values

Value | Count | Frequency (%)
1호선 | 80 | 100.0%

[Figure: Histogram of lengths of the category]
[Figure: Common Values (Plot)]

역명 (station name)
Text

Distinct: 40
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Memory size: 772.0 B

Length

Max length: 16
Median length: 2
Mean length: 4.025
Min length: 2

Characters and Unicode

Total characters: 322
Distinct characters: 86
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
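
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for a single station name taken from the sample rows of this report; the helper `category_counts` and the `CATEGORY_NAMES` mapping are hypothetical names, and this is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter

# Long names for the four general categories that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
}

def category_counts(values):
    """Count Unicode general categories over every character in the given strings."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)

# One station name from the sample rows: Hangul syllables plus ASCII parentheses.
counts = category_counts(["남산(부산외국대학교)"])
for code, count in counts.items():
    print(CATEGORY_NAMES.get(code, code), count)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the tables for this variable.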

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 괴정
2nd row: 괴정
3rd row: 교대
4th row: 교대
5th row: 구서

Value | Count | Frequency (%)
괴정 | 2 | 2.5%
교대 | 2 | 2.5%
좌천 | 2 | 2.5%
시청(연제 | 2 | 2.5%
신평 | 2 | 2.5%
양정 | 2 | 2.5%
연산 | 2 | 2.5%
온천장 | 2 | 2.5%
자갈치 | 2 | 2.5%
장전(부산가톨릭대학교 | 2 | 2.5%
Other values (30) | 60 | 75.0%

Most occurring characters

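Value | Count | Frequency (%)
[unreadable] | 20 | 6.2%
[unreadable] | 18 | 5.6%
( | 16 | 5.0%
) | 16 | 5.0%
[unreadable] | 16 | 5.0%
[unreadable] | 12 | 3.7%
[unreadable] | 8 | 2.5%
[unreadable] | 8 | 2.5%
[unreadable] | 8 | 2.5%
[unreadable] | 6 | 1.9%
Other values (76) | 194 | 60.2%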

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 288 | 89.4%
Open Punctuation | 16 | 5.0%
Close Punctuation | 16 | 5.0%
Other Punctuation | 2 | 0.6%

Most frequent character per category

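Other Letter
Value | Count | Frequency (%)
[unreadable] | 20 | 6.9%
[unreadable] | 18 | 6.2%
[unreadable] | 16 | 5.6%
[unreadable] | 12 | 4.2%
[unreadable] | 8 | 2.8%
[unreadable] | 8 | 2.8%
[unreadable] | 8 | 2.8%
[unreadable] | 6 | 2.1%
[unreadable] | 6 | 2.1%
[unreadable] | 6 | 2.1%
Other values (73) | 180 | 62.5%

Open Punctuation
Value | Count | Frequency (%)
( | 16 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 16 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
· | 2 | 100.0%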

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 288 | 89.4%
Common | 34 | 10.6%

Most frequent character per script

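Hangul
Value | Count | Frequency (%)
[unreadable] | 20 | 6.9%
[unreadable] | 18 | 6.2%
[unreadable] | 16 | 5.6%
[unreadable] | 12 | 4.2%
[unreadable] | 8 | 2.8%
[unreadable] | 8 | 2.8%
[unreadable] | 8 | 2.8%
[unreadable] | 6 | 2.1%
[unreadable] | 6 | 2.1%
[unreadable] | 6 | 2.1%
Other values (73) | 180 | 62.5%

Common
Value | Count | Frequency (%)
( | 16 | 47.1%
) | 16 | 47.1%
· | 2 | 5.9%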

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 288 | 89.4%
ASCII | 32 | 9.9%
None | 2 | 0.6%

Most frequent character per block

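Hangul
Value | Count | Frequency (%)
[unreadable] | 20 | 6.9%
[unreadable] | 18 | 6.2%
[unreadable] | 16 | 5.6%
[unreadable] | 12 | 4.2%
[unreadable] | 8 | 2.8%
[unreadable] | 8 | 2.8%
[unreadable] | 8 | 2.8%
[unreadable] | 6 | 2.1%
[unreadable] | 6 | 2.1%
[unreadable] | 6 | 2.1%
Other values (73) | 180 | 62.5%

ASCII
Value | Count | Frequency (%)
( | 16 | 50.0%
) | 16 | 50.0%

None
Value | Count | Frequency (%)
· | 2 | 100.0%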

승강장번호 (platform number)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 772.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 2
3rd row: 2
4th row: 1
5th row: 2

Common Values

Value | Count | Frequency (%)
1 | 40 | 50.0%
2 | 40 | 50.0%

[Figure: Histogram of lengths of the category]
[Figure: Common Values (Plot)]

상하행 (up/down direction; 상행 = up, 하행 = down)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 772.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 상행
2nd row: 하행
3rd row: 하행
4th row: 상행
5th row: 하행

Common Values

Value | Count | Frequency (%)
상행 | 40 | 50.0%
하행 | 40 | 50.0%

[Figure: Histogram of lengths of the category]
[Figure: Common Values (Plot)]

지상구분 (above-ground/underground classification; 지하 = underground, 지상 = above ground)
Categorical

Distinct: 2
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 772.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지하
2nd row: 지하
3rd row: 지하
4th row: 지하
5th row: 지상

Common Values

Value | Count | Frequency (%)
지하 | 66 | 82.5%
지상 | 14 | 17.5%

[Figure: Histogram of lengths of the category]
[Figure: Common Values (Plot)]

역층 (station floor)
Categorical

IMBALANCE

Distinct: 5
Distinct (%): 6.2%
Missing: 0
Missing (%): 0.0%
Memory size: 772.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value | Count | Frequency (%)
2 | 70 | 87.5%
3 | 4 | 5.0%
5 | 2 | 2.5%
4 | 2 | 2.5%
1 | 2 | 2.5%

[Figure: Histogram of lengths of the category]
[Figure: Common Values (Plot)]

승강장연결 여부 (platform connection status)
Boolean

Distinct: 2
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 212.0 B

Value | Count | Frequency (%)
False | 54 | 67.5%
True | 26 | 32.5%

스크린도어 유무 (platform screen door presence)
Boolean

CONSTANT

Distinct: 1
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 212.0 B

Value | Count | Frequency (%)
True | 80 | 100.0%

안전발판 유무 (safety footboard presence)
Boolean

CONSTANT

Distinct: 1
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 212.0 B

Value | Count | Frequency (%)
True | 80 | 100.0%

Correlations

[Figure: correlation plot]

 | 역명 | 승강장번호 | 상하행 | 지상구분 | 역층 | 승강장연결 여부
역명 | 1.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000
승강장번호 | 0.000 | 1.000 | 0.999 | 0.000 | 0.000 | 0.000
상하행 | 0.000 | 0.999 | 1.000 | 0.000 | 0.000 | 0.000
지상구분 | 1.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.143
역층 | 1.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.253
승강장연결 여부 | 1.000 | 0.000 | 0.000 | 0.143 | 0.253 | 1.000

[Figure: correlation plot]

 | 승강장연결 여부 | 역층 | 승강장번호 | 상하행 | 지상구분
승강장연결 여부 | 1.000 | 0.302 | 0.000 | 0.000 | 0.090
역층 | 0.302 | 1.000 | 0.000 | 0.000 | 0.000
승강장번호 | 0.000 | 0.000 | 1.000 | 0.975 | 0.000
상하행 | 0.000 | 0.000 | 0.975 | 1.000 | 0.000
지상구분 | 0.090 | 0.000 | 0.000 | 0.000 | 1.000

[Figure: correlation plot]

 | 승강장번호 | 상하행 | 지상구분 | 역층 | 승강장연결 여부
승강장번호 | 1.000 | 0.975 | 0.000 | 0.000 | 0.000
상하행 | 0.975 | 1.000 | 0.000 | 0.000 | 0.000
지상구분 | 0.000 | 0.000 | 1.000 | 0.000 | 0.090
역층 | 0.000 | 0.000 | 0.000 | 1.000 | 0.302
승강장연결 여부 | 0.000 | 0.000 | 0.090 | 0.302 | 1.000
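
The report does not say which coefficient each matrix uses. As a hedged check, the 0.975 value between 승강장번호 and 상하행 is what a bias-corrected Cramér's V with Yates' continuity correction gives, assuming every platform 1 row is 상행 and every platform 2 row is 하행 (the sample rows are consistent with this, but the report does not confirm it for all 80 rows). The function name `cramers_v_corrected` is hypothetical.

```python
import math

def cramers_v_corrected(table, yates=True):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows.

    Assumes every row and column total is non-zero.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    r, k = len(row_totals), len(col_totals)
    # Yates' continuity correction is only applied to 2x2 tables.
    use_yates = yates and r == 2 and k == 2
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if use_yates:
                diff = max(0.0, diff - 0.5)
            chi2 += diff * diff / expected
    phi2 = chi2 / n
    # Bias correction for finite sample size.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Assumed contingency table: rows = 승강장번호 (1, 2), columns = 상하행 (상행, 하행).
table = [[40, 0], [0, 40]]
v = cramers_v_corrected(table)
print(round(v, 3))  # 0.975
```

Without the continuity correction the same table yields a coefficient of 1, so the gap between 0.975 and a perfect score reflects the correction rather than any mismatched rows under this assumption.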

Missing values

[Figure: nullity bar chart] A simple visualization of nullity by column.
[Figure: nullity matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

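First rows

 | 철도운영기관명 | 선명 | 역명 | 승강장번호 | 상하행 | 지상구분 | 역층 | 승강장연결 여부 | 스크린도어 유무 | 안전발판 유무
0 | 부산교통공사 | 1호선 | 괴정 | 1 | 상행 | 지하 | 2 | Y | Y | Y
1 | 부산교통공사 | 1호선 | 괴정 | 2 | 하행 | 지하 | 2 | Y | Y | Y
2 | 부산교통공사 | 1호선 | 교대 | 2 | 하행 | 지하 | 2 | Y | Y | Y
3 | 부산교통공사 | 1호선 | 교대 | 1 | 상행 | 지하 | 2 | Y | Y | Y
4 | 부산교통공사 | 1호선 | 구서 | 2 | 하행 | 지상 | 2 | N | Y | Y
5 | 부산교통공사 | 1호선 | 구서 | 1 | 상행 | 지상 | 2 | N | Y | Y
6 | 부산교통공사 | 1호선 | 남산(부산외국대학교) | 2 | 하행 | 지하 | 2 | N | Y | Y
7 | 부산교통공사 | 1호선 | 남산(부산외국대학교) | 1 | 상행 | 지하 | 2 | N | Y | Y
8 | 부산교통공사 | 1호선 | 남포 | 1 | 상행 | 지하 | 2 | N | Y | Y
9 | 부산교통공사 | 1호선 | 남포 | 2 | 하행 | 지하 | 2 | N | Y | Y

Last rows

 | 철도운영기관명 | 선명 | 역명 | 승강장번호 | 상하행 | 지상구분 | 역층 | 승강장연결 여부 | 스크린도어 유무 | 안전발판 유무
70 | 부산교통공사 | 1호선 | 다대포항 | 2 | 하행 | 지하 | 2 | N | Y | Y
71 | 부산교통공사 | 1호선 | 다대포항 | 1 | 상행 | 지하 | 2 | N | Y | Y
72 | 부산교통공사 | 1호선 | 다대포해수욕장 | 2 | 하행 | 지하 | 2 | Y | Y | Y
73 | 부산교통공사 | 1호선 | 다대포해수욕장 | 1 | 상행 | 지하 | 2 | Y | Y | Y
74 | 부산교통공사 | 1호선 | 동매 | 1 | 상행 | 지하 | 2 | N | Y | Y
75 | 부산교통공사 | 1호선 | 동매 | 2 | 하행 | 지하 | 2 | N | Y | Y
76 | 부산교통공사 | 1호선 | 신장림 | 1 | 상행 | 지하 | 2 | N | Y | Y
77 | 부산교통공사 | 1호선 | 신장림 | 2 | 하행 | 지하 | 2 | N | Y | Y
78 | 부산교통공사 | 1호선 | 장림 | 2 | 하행 | 지하 | 2 | N | Y | Y
79 | 부산교통공사 | 1호선 | 장림 | 1 | 상행 | 지하 | 2 | N | Y | Y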