Overview

Dataset statistics

Number of variables: 5
Number of observations: 5605
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 31
Duplicate rows (%): 0.6%
Total size in memory: 219.1 KiB
Average record size in memory: 40.0 B
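
The two derived figures in this table can be cross-checked from the raw ones. A minimal sketch in Python, assuming that KiB means 1024 bytes and that the report rounds to one decimal place (both are assumptions, not stated in the report):

```python
# Values copied from the Dataset statistics table.
n_observations = 5605
n_duplicate_rows = 31
total_size_kib = 219.1  # assumption: 1 KiB = 1024 bytes

# Share of duplicate rows as a percentage of all observations.
duplicate_pct = round(100 * n_duplicate_rows / n_observations, 1)

# Average record size in bytes: total bytes divided by the number of rows.
avg_record_bytes = round(total_size_kib * 1024 / n_observations, 1)

print(duplicate_pct, avg_record_bytes)
```

Both results agree with the reported 0.6% and 40.0 B.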

Variable types

Categorical: 3
Text: 2

Dataset

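Description: Data on the urban and metropolitan railway stations managed by Seoul Metro (서울교통공사), including railway operator name (철도운영기관명), line name (선명), station name (역명), exit number (출구번호), main facilities by exit (출구별 주요시설명), address, and so on.
Author: Korea National Railway (국가철도공단)
URL: https://www.data.go.kr/data/15073460/fileData.do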

Alerts

철도운영기관명 has constant value "서울교통공사" (Constant)
Dataset has 31 (0.6%) duplicate rows (Duplicates)
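
Both alerts can be reproduced with simple counting. The sketch below is an illustration, not the profiler's own code: the helper names `constant_columns` and `duplicated_groups` are hypothetical, the rows are a tiny hand-picked subset of the dataset, and counting each distinct repeated row once is an assumption about how the report counts duplicates.

```python
from collections import Counter

# Column names as they appear in the report. The rows are a small
# illustrative subset (the first row is repeated on purpose).
columns = ["철도운영기관명", "선명", "역명", "출구번호", "출구별 주요시설명"]
rows = [
    ("서울교통공사", "1호선", "시청", "4", "서울글로벌센터"),
    ("서울교통공사", "1호선", "시청", "4", "서울글로벌센터"),
    ("서울교통공사", "2호선", "강남", "10", "교보타워"),
    ("서울교통공사", "8호선", "모란", "6", "모란시장"),
]


def constant_columns(columns, rows):
    """Return the names of columns that hold exactly one distinct value."""
    return [
        name
        for index, name in enumerate(columns)
        if len({row[index] for row in rows}) == 1
    ]


def duplicated_groups(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(rows)
    return {row: count for row, count in counts.items() if count > 1}


print(constant_columns(columns, rows))
print(duplicated_groups(rows))
```

On the subset above, only 철도운영기관명 is constant, and one row occurs twice.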

Reproduction

Analysis started: 2023-12-12 16:52:44.584708
Analysis finished: 2023-12-12 16:52:45.817305
Duration: 1.23 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

철도운영기관명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 43.9 KiB

서울교통공사: 5605

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울교통공사
2nd row: 서울교통공사
3rd row: 서울교통공사
4th row: 서울교통공사
5th row: 서울교통공사

Common Values

Value           Count    Frequency (%)
서울교통공사    5605     100.0%

Length

Histogram of lengths of the category

Common Values (Plot)


선명
Categorical

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 43.9 KiB

2호선: 1549
3호선: 921
7호선: 817
4호선: 726
5호선: 546
Other values (3): 1046

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1호선
2nd row: 1호선
3rd row: 1호선
4th row: 1호선
5th row: 1호선

Common Values

Value    Count    Frequency (%)
2호선    1549     27.6%
3호선    921      16.4%
7호선    817      14.6%
4호선    726      13.0%
5호선    546      9.7%
1호선    477      8.5%
6호선    386      6.9%
8호선    183      3.3%
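
The frequency column follows directly from the counts. A short Python sketch that recomputes the percentages for 선명 and the "Other values (3)" total shown in the variable overview, assuming rounding to one decimal place:

```python
# Counts per line (선명), copied from the Common Values table.
counts = {
    "2호선": 1549,
    "3호선": 921,
    "7호선": 817,
    "4호선": 726,
    "5호선": 546,
    "1호선": 477,
    "6호선": 386,
    "8호선": 183,
}

total = sum(counts.values())

# Frequency (%) of each value, rounded to one decimal place.
frequency_pct = {line: round(100 * n / total, 1) for line, n in counts.items()}

# The overview lists the five most common values and groups the rest.
other_values = sum(sorted(counts.values(), reverse=True)[5:])

print(total, frequency_pct, other_values)
```

The counts add up to the 5605 observations, and the three least common lines account for the 1046 rows grouped as other values.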

Length

Histogram of lengths of the category

Common Values (Plot)


역명
Text

Distinct: 216
Distinct (%): 3.9%
Missing: 0
Missing (%): 0.0%
Memory size: 43.9 KiB

Length

Max length: 16
Median length: 14
Mean length: 4.3678858
Min length: 2
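
The length statistics can be sanity-checked against each other. The mean should equal the total character count (24482, reported under Characters and Unicode) divided by the number of observations. In addition, at least half of the values are as long as the median, which puts a lower bound on the mean. The sketch below applies both checks; `mean_lower_bound` is a hypothetical helper written for this illustration.

```python
def mean_lower_bound(n, min_length, median_length):
    """Smallest possible mean when at least half of the n values are
    >= median_length and the remaining values are >= min_length."""
    upper_half = (n + 1) // 2  # values at or above the median
    lower_half = n - upper_half
    return (upper_half * median_length + lower_half * min_length) / n


n = 5605
mean_from_total = 24482 / n  # total characters / observations
bound = mean_lower_bound(n, min_length=2, median_length=14)

# True only if the reported mean, median and minimum can hold together.
consistent = mean_from_total >= bound

print(round(mean_from_total, 7), round(bound, 3), consistent)
```

The recomputed mean matches the reported 4.3678858, but it lies below the bound of about 8.0 implied by a median of 14, so the reported median length is worth re-checking against the raw data. The check cannot tell what the true median is.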

Characters and Unicode

Total characters: 24482
Distinct characters: 225
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
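
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one station name from the sample; `category_counts` is a hypothetical helper, and this is not necessarily how the profiler computes its tables. The short codes correspond to the long names used in the report: `Lo` is Other Letter, `Ps` is Open Punctuation, `Pe` is Close Punctuation, and `Nd` is Decimal Number.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the characters of text by Unicode general category."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "청량리(서울시립대입구)"
counts = category_counts(sample)

# Hangul syllables fall under "Lo"; the parentheses under "Ps" and "Pe".
print(counts)
```

The standard library does not expose scripts or blocks, so those tables would need an additional data source.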

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 청량리(서울시립대입구)
2nd row: 청량리(서울시립대입구)
3rd row: 청량리(서울시립대입구)
4th row: 청량리(서울시립대입구)
5th row: 청량리(서울시립대입구)

Value                 Count    Frequency (%)
사당                  136      2.4%
시청                  119      2.1%
서울역                118      2.1%
대림(구로구청         104      1.9%
충무로                84       1.5%
경복궁(정부서울청사   81       1.4%
안국                  73       1.3%
서울대입구(관악구청   69       1.2%
동대문역사문화공원    68       1.2%
건대입구              66       1.2%
Other values (206)    4687     83.6%

Most occurring characters

ValueCountFrequency (%)
) 1237
 
5.1%
( 1237
 
5.1%
1193
 
4.9%
1132
 
4.6%
740
 
3.0%
615
 
2.5%
553
 
2.3%
542
 
2.2%
509
 
2.1%
481
 
2.0%
Other values (215) 16243
66.3%

Most occurring categories

Value                Count    Frequency (%)
Other Letter         21773    88.9%
Close Punctuation    1237     5.1%
Open Punctuation     1237     5.1%
Decimal Number       170      0.7%
Other Punctuation    65       0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1193
 
5.5%
1132
 
5.2%
740
 
3.4%
615
 
2.8%
553
 
2.5%
542
 
2.5%
509
 
2.3%
481
 
2.2%
467
 
2.1%
418
 
1.9%
Other values (209) 15123
69.5%
Decimal Number
ValueCountFrequency (%)
3 109
64.1%
5 46
27.1%
4 15
 
8.8%
Close Punctuation
ValueCountFrequency (%)
) 1237
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1237
100.0%
Other Punctuation
ValueCountFrequency (%)
· 65
100.0%

Most occurring scripts

Value     Count    Frequency (%)
Hangul    21773    88.9%
Common    2709     11.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1193
 
5.5%
1132
 
5.2%
740
 
3.4%
615
 
2.8%
553
 
2.5%
542
 
2.5%
509
 
2.3%
481
 
2.2%
467
 
2.1%
418
 
1.9%
Other values (209) 15123
69.5%
Common
ValueCountFrequency (%)
) 1237
45.7%
( 1237
45.7%
3 109
 
4.0%
· 65
 
2.4%
5 46
 
1.7%
4 15
 
0.6%

Most occurring blocks

Value     Count    Frequency (%)
Hangul    21773    88.9%
ASCII     2644     10.8%
None      65       0.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
) 1237
46.8%
( 1237
46.8%
3 109
 
4.1%
5 46
 
1.7%
4 15
 
0.6%
Hangul
ValueCountFrequency (%)
1193
 
5.5%
1132
 
5.2%
740
 
3.4%
615
 
2.8%
553
 
2.5%
542
 
2.5%
509
 
2.3%
481
 
2.2%
467
 
2.1%
418
 
1.9%
Other values (209) 15123
69.5%
None
ValueCountFrequency (%)
· 65
100.0%

출구번호
Categorical

Distinct: 24
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 43.9 KiB

1: 1019
3: 857
4: 808
2: 806
5: 495
Other values (19): 1620

Length

Max length: 3
Median length: 1
Mean length: 1.0790366
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%
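
Distinct and Unique measure different things. The sketch below assumes that Distinct is the number of different values and that Unique is the number of values occurring exactly once, which is consistent with the constant column having 1 distinct and 0 unique values. `distinct_and_unique` is a hypothetical helper, applied here to the exit numbers of the first ten rows of the dataset.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for count in counts.values() if count == 1)
    return distinct, unique


# 출구번호 of the first ten rows of the dataset.
exits = ["1", "1", "2", "2", "2", "3", "3", "4", "4", "5"]

print(distinct_and_unique(exits))
```

In these ten rows there are five distinct exit numbers, and only exit 5 occurs once.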

Sample

1st row: 1
2nd row: 1
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value                Count    Frequency (%)
1                    1019     18.2%
3                    857      15.3%
4                    808      14.4%
2                    806      14.4%
5                    495      8.8%
6                    446      8.0%
7                    343      6.1%
8                    279      5.0%
9                    141      2.5%
10                   133      2.4%
Other values (14)    278      5.0%

Length

Histogram of lengths of the category

출구별 주요시설명
Text

Distinct: 4158
Distinct (%): 74.2%
Missing: 0
Missing (%): 0.0%
Memory size: 43.9 KiB

Length

Max length: 21
Median length: 19
Mean length: 6.344157
Min length: 2

Characters and Unicode

Total characters: 35559
Distinct characters: 593
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3280
Unique (%): 58.5%

Sample

1st row: 경동시장
2nd row: 동서시장
3rd row: 청량리우체국
4th row: 동대문경찰서
5th row: 현대코아빌딩

Value                  Count    Frequency (%)
방면                   51       0.8%
고교                   35       0.6%
고등학교               26       0.4%
국민은행               25       0.4%
우리은행               25       0.4%
주민센터               23       0.4%
아파트                 23       0.4%
현대아파트             22       0.4%
외환은행               19       0.3%
우체국                 19       0.3%
Other values (4297)    5974     95.7%

Most occurring characters

ValueCountFrequency (%)
1164
 
3.3%
1155
 
3.2%
984
 
2.8%
757
 
2.1%
756
 
2.1%
708
 
2.0%
661
 
1.9%
648
 
1.8%
641
 
1.8%
641
 
1.8%
Other values (583) 27444
77.2%

Most occurring categories

Value                Count    Frequency (%)
Other Letter         33460    94.1%
Decimal Number       721      2.0%
Space Separator      641      1.8%
Uppercase Letter     259      0.7%
Other Punctuation    186      0.5%
Open Punctuation     131      0.4%
Close Punctuation    131      0.4%
Other Symbol         8        < 0.1%
Dash Punctuation     8        < 0.1%
Math Symbol          7        < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1164
 
3.5%
1155
 
3.5%
984
 
2.9%
757
 
2.3%
756
 
2.3%
708
 
2.1%
661
 
2.0%
648
 
1.9%
641
 
1.9%
620
 
1.9%
Other values (534) 25366
75.8%
Uppercase Letter
ValueCountFrequency (%)
S 39
15.1%
K 38
14.7%
G 25
9.7%
C 24
9.3%
T 21
 
8.1%
L 16
 
6.2%
I 12
 
4.6%
B 11
 
4.2%
Y 9
 
3.5%
A 9
 
3.5%
Other values (13) 55
21.2%
Decimal Number
ValueCountFrequency (%)
1 243
33.7%
2 188
26.1%
3 114
15.8%
4 59
 
8.2%
5 39
 
5.4%
7 21
 
2.9%
6 21
 
2.9%
9 18
 
2.5%
0 10
 
1.4%
8 8
 
1.1%
Lowercase Letter
ValueCountFrequency (%)
o 2
28.6%
k 1
14.3%
r 1
14.3%
a 1
14.3%
p 1
14.3%
d 1
14.3%
Other Punctuation
ValueCountFrequency (%)
/ 177
95.2%
. 4
 
2.2%
· 4
 
2.2%
& 1
 
0.5%
Space Separator
ValueCountFrequency (%)
641
100.0%
Open Punctuation
ValueCountFrequency (%)
( 131
100.0%
Close Punctuation
ValueCountFrequency (%)
) 131
100.0%
Other Symbol
ValueCountFrequency (%)
8
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8
100.0%
Math Symbol
ValueCountFrequency (%)
~ 7
100.0%

Most occurring scripts

Value     Count    Frequency (%)
Hangul    33468    94.1%
Common    1825     5.1%
Latin     266      0.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1164
 
3.5%
1155
 
3.5%
984
 
2.9%
757
 
2.3%
756
 
2.3%
708
 
2.1%
661
 
2.0%
648
 
1.9%
641
 
1.9%
620
 
1.9%
Other values (535) 25374
75.8%
Latin
ValueCountFrequency (%)
S 39
14.7%
K 38
14.3%
G 25
9.4%
C 24
 
9.0%
T 21
 
7.9%
L 16
 
6.0%
I 12
 
4.5%
B 11
 
4.1%
Y 9
 
3.4%
A 9
 
3.4%
Other values (19) 62
23.3%
Common
ValueCountFrequency (%)
641
35.1%
1 243
 
13.3%
2 188
 
10.3%
/ 177
 
9.7%
( 131
 
7.2%
) 131
 
7.2%
3 114
 
6.2%
4 59
 
3.2%
5 39
 
2.1%
7 21
 
1.2%
Other values (9) 81
 
4.4%

Most occurring blocks

Value     Count    Frequency (%)
Hangul    33460    94.1%
ASCII     2087     5.9%
None      12       < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1164
 
3.5%
1155
 
3.5%
984
 
2.9%
757
 
2.3%
756
 
2.3%
708
 
2.1%
661
 
2.0%
648
 
1.9%
641
 
1.9%
620
 
1.9%
Other values (534) 25366
75.8%
ASCII
ValueCountFrequency (%)
641
30.7%
1 243
 
11.6%
2 188
 
9.0%
/ 177
 
8.5%
( 131
 
6.3%
) 131
 
6.3%
3 114
 
5.5%
4 59
 
2.8%
5 39
 
1.9%
S 39
 
1.9%
Other values (37) 325
15.6%
None
ValueCountFrequency (%)
8
66.7%
· 4
33.3%

Correlations

            선명     출구번호
선명        1.000    0.320
출구번호    0.320    1.000

            출구번호    선명
출구번호    1.000       0.118
선명        0.118       1.000

            선명     출구번호
선명        1.000    0.118
출구번호    0.118    1.000
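
The report does not say which coefficient each matrix shows. One common measure of association between two categorical variables such as 선명 and 출구번호 is Cramér's V, which is derived from the chi-squared statistic of their contingency table. The sketch below is a plain, uncorrected implementation; the function name `cramers_v` and the toy tables are hypothetical, and it is an assumption that any of the matrices above was computed this way.

```python
import math


def cramers_v(table):
    """Cramér's V (no bias correction) for a contingency table.

    table is a list of rows of observed counts. Every row and column
    total must be positive, and the table must be at least 2 x 2.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Chi-squared statistic: squared deviation from the counts expected
    # under independence, scaled by those expected counts.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))


# Toy tables: perfectly associated and fully independent variables.
print(cramers_v([[10, 0], [0, 10]]))
print(cramers_v([[5, 5], [5, 5]]))
```

The value ranges from 0 (no association) to 1 (one variable determines the other), so coefficients such as 0.118 and 0.320 would indicate a weak to moderate association under this reading.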

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

        철도운영기관명    선명     역명                      출구번호    출구별 주요시설명
0       서울교통공사      1호선    청량리(서울시립대입구)    1           경동시장
1       서울교통공사      1호선    청량리(서울시립대입구)    1           동서시장
2       서울교통공사      1호선    청량리(서울시립대입구)    2           청량리우체국
3       서울교통공사      1호선    청량리(서울시립대입구)    2           동대문경찰서
4       서울교통공사      1호선    청량리(서울시립대입구)    2           현대코아빌딩
5       서울교통공사      1호선    청량리(서울시립대입구)    3           동대문세무서
6       서울교통공사      1호선    청량리(서울시립대입구)    3           미주아파트
7       서울교통공사      1호선    청량리(서울시립대입구)    4           서울시립대학교
8       서울교통공사      1호선    청량리(서울시립대입구)    4           청량리역(국철)
9       서울교통공사      1호선    청량리(서울시립대입구)    5           역전치안센터

        철도운영기관명    선명     역명    출구번호    출구별 주요시설명
5595    서울교통공사      8호선    모란    6           모란시장
5596    서울교통공사      8호선    모란    7           성남IC방면
5597    서울교통공사      8호선    모란    8           탄천방면
5598    서울교통공사      8호선    모란    9           성수초등학교
5599    서울교통공사      8호선    모란    10          근로복지공단 성남지사
5600    서울교통공사      8호선    모란    11          수진동우체국
5601    서울교통공사      8호선    모란    11          성남소방서
5602    서울교통공사      8호선    모란    11          풍생중/ 고등학교
5603    서울교통공사      8호선    모란    12          중앙로
5604    서울교통공사      8호선    모란    12          성남종합운동장

Duplicate rows

Most frequently occurring

     철도운영기관명    선명     역명                  출구번호    출구별 주요시설명    # duplicates
0    서울교통공사      1호선    시청                  4           서울글로벌센터      2
1    서울교통공사      2호선    강남                  10          교보타워            2
2    서울교통공사      2호선    강남                  10          한남대교방면        2
3    서울교통공사      2호선    강변(동서울터미널)    4           서울광진학교        2
4    서울교통공사      2호선    건대입구              5           동자초등학교        2
5    서울교통공사      2호선    건대입구              5           신양초등학교        2
6    서울교통공사      2호선    방배                  2           경남아파트          2
7    서울교통공사      2호선    사당                  1           예술의전당방면      2
8    서울교통공사      2호선    잠실나루              2           진주아파트          2
9    서울교통공사      2호선    잠실나루              3           송파구청            2