Overview

Dataset statistics

Number of variables: 5
Number of observations: 940
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 3
Duplicate rows (%): 0.3%
Total size in memory: 36.8 KiB
Average record size in memory: 40.1 B

Variable types

Categorical: 4
Text: 1

Dataset

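Description: Data on the urban and metropolitan railway stations on Seoul Metropolitan Area Line 4 (수도권4호선), including the railway operator name, line name, station name, exit number, major facilities by exit, and address.
Author: 국가철도공단
URL: https://www.data.go.kr/data/15073462/fileData.do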

Alerts

선명 has constant value "4호선" (Constant)
Dataset has 3 (0.3%) duplicate rows (Duplicates)
역명 is highly overall correlated with 철도운영기관명 (High correlation)
철도운영기관명 is highly overall correlated with 역명 (High correlation)
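
The Constant alert can be reproduced by counting distinct values per column. The following is a minimal sketch using only the Python standard library; the `COLUMNS` and `ROWS` names and the `constant_columns` helper are hypothetical, and the rows are a small hand-copied subset of the report's sample tables, not the full dataset.

```python
# Hypothetical miniature of the dataset: a few rows copied from the report's
# sample tables, in the column order used by the report.
COLUMNS = ["철도운영기관명", "선명", "역명", "출구번호", "출구별 주요시설명"]
ROWS = [
    ("서울교통공사", "4호선", "당고개", "1", "상계4동"),
    ("서울교통공사", "4호선", "당고개", "2", "기업은행"),
    ("서울교통공사", "4호선", "상계", "1", "중계4동사무소"),
    ("코레일", "4호선", "정왕", "2", "시흥교육지원청"),
    ("코레일", "4호선", "오이도", "3", "시화자연생태학교"),
]


def constant_columns(columns, rows):
    """Return the names of columns that hold exactly one distinct value."""
    constant = []
    for index, name in enumerate(columns):
        distinct = {row[index] for row in rows}
        if len(distinct) == 1:
            constant.append(name)
    return constant


print(constant_columns(COLUMNS, ROWS))
```

On this subset only 선명 is flagged, which matches the alert: every row belongs to 4호선.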

Reproduction

Analysis started: 2023-12-12 20:11:16.868451
Analysis finished: 2023-12-12 20:11:17.696664
Duration: 0.83 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

철도운영기관명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.5 KiB

Value | Count
서울교통공사 | 726
코레일 | 214

Length

Max length: 6
Median length: 6
Mean length: 5.3170213
Min length: 3
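
The length statistics follow directly from the value counts: 726 values of length 6 and 214 values of length 3. A small sketch of that arithmetic, assuming lengths are counted in Unicode code points after NFC normalization (an assumption about how the report measures length, not something the report states):

```python
import unicodedata
from statistics import mean, median

# Value counts for 철도운영기관명, taken from the report.
value_counts = {"서울교통공사": 726, "코레일": 214}

# Expand the counts into one string length per observation.
lengths = [
    len(unicodedata.normalize("NFC", value))
    for value, count in value_counts.items()
    for _ in range(count)
]

stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": round(mean(lengths), 7),
    "min": min(lengths),
}
print(stats)
```

The mean works out to (726 * 6 + 214 * 3) / 940 = 4998 / 940, which rounds to the 5.3170213 shown above.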

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울교통공사
2nd row: 서울교통공사
3rd row: 서울교통공사
4th row: 서울교통공사
5th row: 서울교통공사

Common Values

Value | Count | Frequency (%)
서울교통공사 | 726 | 77.2%
코레일 | 214 | 22.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

선명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.5 KiB

Value | Count
4호선 | 940

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4호선
2nd row: 4호선
3rd row: 4호선
4th row: 4호선
5th row: 4호선

Common Values

Value | Count | Frequency (%)
4호선 | 940 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

역명
Categorical

HIGH CORRELATION 

Distinct: 46
Distinct (%): 4.9%
Missing: 0
Missing (%): 0.0%
Memory size: 7.5 KiB

Value | Count
사당 | 68
서울역 | 59
미아사거리 | 43
충무로 | 42
명동 | 37
Other values (41) | 691

Length

Max length: 11
Median length: 10
Mean length: 4.0797872
Min length: 2

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 당고개
2nd row: 당고개
3rd row: 당고개
4th row: 당고개
5th row: 당고개

Common Values

Value | Count | Frequency (%)
사당 | 68 | 7.2%
서울역 | 59 | 6.3%
미아사거리 | 43 | 4.6%
충무로 | 42 | 4.5%
명동 | 37 | 3.9%
동대문 | 37 | 3.9%
동대문역사문화공원 | 34 | 3.6%
미아 | 33 | 3.5%
창동 | 33 | 3.5%
숙대입구(갈월) | 33 | 3.5%
Other values (36) | 521 | 55.4%

Length

[Figure: Histogram of lengths of the category]

출구번호
Categorical

Distinct: 18
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 7.5 KiB

Value | Count
1 | 194
2 | 146
4 | 122
3 | 109
6 | 83
Other values (13) | 286

Length

Max length: 3
Median length: 1
Mean length: 1.1117021
Min length: 1

Unique

Unique: 2
Unique (%): 0.2%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 2
5th row: 2

Common Values

Value | Count | Frequency (%)
1 | 194 | 20.6%
2 | 146 | 15.5%
4 | 122 | 13.0%
3 | 109 | 11.6%
6 | 83 | 8.8%
5 | 70 | 7.4%
7 | 56 | 6.0%
8 | 48 | 5.1%
10 | 25 | 2.7%
9 | 18 | 1.9%
Other values (8) | 69 | 7.3%

Length

[Figure: Histogram of lengths of the category]

출구별 주요시설명
Text

Distinct: 806
Distinct (%): 85.7%
Missing: 0
Missing (%): 0.0%
Memory size: 7.5 KiB

Length

Max length: 20
Median length: 16
Mean length: 6.212766
Min length: 2

Characters and Unicode

Total characters: 5840
Distinct characters: 387
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
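
As an illustration of how such character properties can be read, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The three strings are facility names that appear in the report's sample rows; the `category_counts` helper is hypothetical, and NFC normalization is an assumption made here so that each Hangul syllable counts as one character.

```python
import unicodedata
from collections import Counter

# Facility names copied from the report's sample rows.
samples = ["상계4동", "신상계 초등학교", "상계4동 파출소"]


def category_counts(values):
    """Count characters per Unicode general category (e.g. Lo, Nd, Zs)."""
    counts = Counter()
    for value in values:
        for char in unicodedata.normalize("NFC", value):
            counts[unicodedata.category(char)] += 1
    return counts


counts = category_counts(samples)
print(counts)
```

The two-letter codes correspond to the category names used in the tables below: `Lo` is Other Letter, `Nd` is Decimal Number, and `Zs` is Space Separator.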

Unique

Unique: 700
Unique (%): 74.5%

Sample

1st row: 상계4동
2nd row: 상계3동주민센터
3rd row: 상계4동사무소
4th row: 상계3동우체국
5th row: 상계종합사회복지관

Value | Count | Frequency (%)
고교 | 10 | 0.9%
아파트 | 7 | 0.6%
주민센터 | 6 | 0.6%
방면 | 6 | 0.6%
파출소 | 5 | 0.5%
국민은행 | 5 | 0.5%
우리은행 | 5 | 0.5%
남대문시장 | 4 | 0.4%
고등학교 | 4 | 0.4%
우체국 | 4 | 0.4%
Other values (869) | 1034 | 94.9%

Most occurring characters

ValueCountFrequency (%)
181
 
3.1%
159
 
2.7%
157
 
2.7%
151
 
2.6%
108
 
1.8%
105
 
1.8%
101
 
1.7%
100
 
1.7%
99
 
1.7%
93
 
1.6%
Other values (377) 4586
78.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5478 | 93.8%
Space Separator | 151 | 2.6%
Decimal Number | 99 | 1.7%
Uppercase Letter | 52 | 0.9%
Other Punctuation | 40 | 0.7%
Open Punctuation | 6 | 0.1%
Close Punctuation | 6 | 0.1%
Lowercase Letter | 3 | 0.1%
Dash Punctuation | 2 | < 0.1%
Math Symbol | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
181
 
3.3%
159
 
2.9%
157
 
2.9%
108
 
2.0%
105
 
1.9%
101
 
1.8%
100
 
1.8%
99
 
1.8%
93
 
1.7%
91
 
1.7%
Other values (341) 4284
78.2%
Uppercase Letter
Value | Count | Frequency (%)
S | 10 | 19.2%
K | 7 | 13.5%
G | 7 | 13.5%
C | 4 | 7.7%
T | 4 | 7.7%
O | 3 | 5.8%
E | 2 | 3.8%
B | 2 | 3.8%
V | 2 | 3.8%
N | 2 | 3.8%
Other values (8) | 9 | 17.3%

Decimal Number
Value | Count | Frequency (%)
1 | 31 | 31.3%
3 | 20 | 20.2%
2 | 15 | 15.2%
4 | 13 | 13.1%
5 | 10 | 10.1%
9 | 6 | 6.1%
6 | 2 | 2.0%
8 | 1 | 1.0%
7 | 1 | 1.0%

Lowercase Letter
Value | Count | Frequency (%)
o | 2 | 66.7%
d | 1 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 151 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 40 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 6 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 6 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 2 | 100.0%

Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 5479 | 93.8%
Common | 306 | 5.2%
Latin | 55 | 0.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
181
 
3.3%
159
 
2.9%
157
 
2.9%
108
 
2.0%
105
 
1.9%
101
 
1.8%
100
 
1.8%
99
 
1.8%
93
 
1.7%
91
 
1.7%
Other values (342) 4285
78.2%

Latin
Value | Count | Frequency (%)
S | 10 | 18.2%
K | 7 | 12.7%
G | 7 | 12.7%
C | 4 | 7.3%
T | 4 | 7.3%
O | 3 | 5.5%
E | 2 | 3.6%
B | 2 | 3.6%
V | 2 | 3.6%
N | 2 | 3.6%
Other values (10) | 12 | 21.8%

Common
Value | Count | Frequency (%)
(space) | 151 | 49.3%
/ | 40 | 13.1%
1 | 31 | 10.1%
3 | 20 | 6.5%
2 | 15 | 4.9%
4 | 13 | 4.2%
5 | 10 | 3.3%
9 | 6 | 2.0%
( | 6 | 2.0%
) | 6 | 2.0%
Other values (5) | 8 | 2.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5478 | 93.8%
ASCII | 361 | 6.2%
None | 1 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
181
 
3.3%
159
 
2.9%
157
 
2.9%
108
 
2.0%
105
 
1.9%
101
 
1.8%
100
 
1.8%
99
 
1.8%
93
 
1.7%
91
 
1.7%
Other values (341) 4284
78.2%

ASCII
Value | Count | Frequency (%)
(space) | 151 | 41.8%
/ | 40 | 11.1%
1 | 31 | 8.6%
3 | 20 | 5.5%
2 | 15 | 4.2%
4 | 13 | 3.6%
S | 10 | 2.8%
5 | 10 | 2.8%
K | 7 | 1.9%
G | 7 | 1.9%
Other values (25) | 57 | 15.8%

None
ValueCountFrequency (%)
1
100.0%

Correlations

Variable | 철도운영기관명 | 역명 | 출구번호
철도운영기관명 | 1.000 | 1.000 | 0.247
역명 | 1.000 | 1.000 | 0.645
출구번호 | 0.247 | 0.645 | 1.000

Variable | 출구번호 | 역명 | 철도운영기관명
출구번호 | 1.000 | 0.215 | 0.193
역명 | 0.215 | 1.000 | 0.976
철도운영기관명 | 0.193 | 0.976 | 1.000

Variable | 철도운영기관명 | 역명 | 출구번호
철도운영기관명 | 1.000 | 0.976 | 0.193
역명 | 0.976 | 1.000 | 0.215
출구번호 | 0.193 | 0.215 | 1.000
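
Associations between categorical columns like these are commonly measured with Cramér's V. The sketch below computes the plain, uncorrected statistic from scratch. Which measure (and which bias correction, if any) produced each matrix above is not stated in the report, so this is an illustration of the idea rather than a reproduction of those numbers. The `cramers_v` helper and the six-row example are hypothetical, with values borrowed from the sample rows.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    # Sum over every cell of the contingency table, including empty cells.
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Tiny illustrative sample: each station belongs to exactly one operator.
operators = ["서울교통공사", "서울교통공사", "서울교통공사", "코레일", "코레일", "코레일"]
stations = ["당고개", "당고개", "상계", "정왕", "오이도", "오이도"]
exits = ["1", "2", "1", "2", "1", "2"]

print(cramers_v(operators, stations))
print(cramers_v(operators, exits))
```

Because every station is served by a single operator, 역명 fully determines 철도운영기관명 and the statistic reaches its maximum for that pair, which is consistent with the High correlation alert. Exit numbers carry far less information about the operator.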

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

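Row | 철도운영기관명 | 선명 | 역명 | 출구번호 | 출구별 주요시설명
0 | 서울교통공사 | 4호선 | 당고개 | 1 | 상계4동
1 | 서울교통공사 | 4호선 | 당고개 | 1 | 상계3동주민센터
2 | 서울교통공사 | 4호선 | 당고개 | 1 | 상계4동사무소
3 | 서울교통공사 | 4호선 | 당고개 | 2 | 상계3동우체국
4 | 서울교통공사 | 4호선 | 당고개 | 2 | 상계종합사회복지관
5 | 서울교통공사 | 4호선 | 당고개 | 2 | 기업은행
6 | 서울교통공사 | 4호선 | 당고개 | 3 | 신상계 초등학교
7 | 서울교통공사 | 4호선 | 당고개 | 4 | 상계4동 파출소
8 | 서울교통공사 | 4호선 | 당고개 | 4 | 삼락시장
9 | 서울교통공사 | 4호선 | 상계 | 1 | 중계4동사무소

Row | 철도운영기관명 | 선명 | 역명 | 출구번호 | 출구별 주요시설명
930 | 코레일 | 4호선 | 정왕 | 2 | 시흥교육지원청
931 | 코레일 | 4호선 | 오이도 | 1 | 송운중학교
932 | 코레일 | 4호선 | 오이도 | 1 | 송운초등학교
933 | 코레일 | 4호선 | 오이도 | 1 | 냉정초등학교
934 | 코레일 | 4호선 | 오이도 | 2 | 함현중학교
935 | 코레일 | 4호선 | 오이도 | 2 | 함현초등학교
936 | 코레일 | 4호선 | 오이도 | 2 | 함현고등학교
937 | 코레일 | 4호선 | 오이도 | 2 | 정왕4동주민센터
938 | 코레일 | 4호선 | 오이도 | 2 | 함현상생종합사회복지관
939 | 코레일 | 4호선 | 오이도 | 3 | 시화자연생태학교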

Duplicate rows

Most frequently occurring

Row | 철도운영기관명 | 선명 | 역명 | 출구번호 | 출구별 주요시설명 | # duplicates
0 | 서울교통공사 | 4호선 | 미아 | 4 | 화계초등학교 | 2
1 | 서울교통공사 | 4호선 | 사당 | 1 | 예술의전당방면 | 2
2 | 서울교통공사 | 4호선 | 상계 | 1 | 성원아파트 | 2
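
Duplicate rows of this kind can be found by counting identical records. A minimal standard-library sketch follows; the `rows` list is a hypothetical five-row excerpt built from rows shown in this report, and `duplicate_rows` is an illustrative helper, not part of the profiling tool.

```python
from collections import Counter

# Hypothetical excerpt: two of the duplicated rows listed above, each repeated,
# plus one row that occurs only once.
rows = [
    ("서울교통공사", "4호선", "미아", "4", "화계초등학교"),
    ("서울교통공사", "4호선", "사당", "1", "예술의전당방면"),
    ("서울교통공사", "4호선", "미아", "4", "화계초등학교"),
    ("서울교통공사", "4호선", "당고개", "1", "상계4동"),
    ("서울교통공사", "4호선", "사당", "1", "예술의전당방면"),
]


def duplicate_rows(records):
    """Map each row that occurs more than once to its number of occurrences."""
    return {row: count for row, count in Counter(records).items() if count > 1}


dupes = duplicate_rows(rows)
for row, count in dupes.items():
    print(row, count)
```

Whether a repeated row is a data-entry error or a legitimate repeat is a judgment call, so a count like this is best treated as a prompt for inspection before anything is dropped.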