Overview

Dataset statistics

Number of variables: 7
Number of observations: 73
Missing cells: 11
Missing cells (%): 2.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.2 KiB
Average record size in memory: 58.8 B
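
As a quick consistency check on the figures above, the missing-cell percentage is the number of missing cells divided by the total number of cells (variables times observations). A minimal sketch using only numbers from this report; the variable names are illustrative:

```python
# Figures taken from the dataset statistics in this report.
n_variables = 7
n_observations = 73
missing_cells = 11

total_cells = n_variables * n_observations        # 7 * 73 = 511 cells
missing_pct = 100 * missing_cells / total_cells   # share of empty cells

print(f"Missing cells (%): {missing_pct:.1f}%")   # prints: Missing cells (%): 2.2%
```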

Variable types

Categorical: 4
Text: 3

Dataset

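Description: Railway operator name (철도운영기관명), line name (선명), station name (역명), above ground or underground (지상지하), station floor (역층), detailed location (상세위치), and telephone number (전화번호) for the urban and metropolitan railway stations on Seoul Metropolitan Area Line 1 (수도권1호선).
Author: 국가철도공단 (Korea National Railway)
URL: https://www.data.go.kr/data/15041414/fileData.do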

Alerts

선명 has constant value "1호선" (Constant)
철도운영기관명 is highly overall correlated with 지상지하 (High correlation)
지상지하 is highly overall correlated with 철도운영기관명 (High correlation)
상세위치 has 10 (13.7%) missing values (Missing)
전화번호 has 1 (1.4%) missing values (Missing)
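
The Constant and Missing alerts can be read as simple per-column checks. The sketch below is a simplified illustration of that idea, not ydata-profiling's implementation; column_alerts is a hypothetical helper, None stands for a missing cell, and the placeholder values are made up.

```python
def column_alerts(name, values):
    """Return alert strings for one column; None marks a missing cell."""
    alerts = []
    present = [v for v in values if v is not None]
    missing = len(values) - len(present)
    if missing:
        # Missing share is taken over all rows of the column.
        alerts.append(f"{name} has {missing} ({missing / len(values):.1%}) missing values")
    if len(set(present)) == 1:
        # A single distinct non-missing value means the column is constant.
        alerts.append(f'{name} has constant value "{present[0]}"')
    return alerts

# 73 rows, as in this report; the non-missing 상세위치 values are placeholders.
print(column_alerts("선명", ["1호선"] * 73))
print(column_alerts("상세위치", [None] * 10 + [f"loc{i}" for i in range(63)]))
```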

Reproduction

Analysis started: 2023-12-12 03:57:34.087246
Analysis finished: 2023-12-12 03:57:34.889531
Duration: 0.8 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

철도운영기관명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 716.0 B

코레일: 63
서울교통공사: 10

Length

Max length: 6
Median length: 3
Mean length: 3.4109589
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%
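
To make the difference between the two counts concrete: Distinct is the number of different values in the column, while Unique is the number of values that occur exactly once. The sketch below rebuilds the column from the value counts in this report (63 rows of 코레일, 10 rows of 서울교통공사) and recomputes the figures. It is an illustration with made-up variable names, not the profiler's own code.

```python
from collections import Counter

# Rebuild the column from the value counts reported for 철도운영기관명.
values = ["코레일"] * 63 + ["서울교통공사"] * 10

counts = Counter(values)
distinct = len(counts)                               # number of different values
unique = sum(1 for c in counts.values() if c == 1)   # values seen exactly once
distinct_pct = 100 * distinct / len(values)
mean_length = sum(len(v) for v in values) / len(values)

print(distinct, unique, f"{distinct_pct:.1f}%", round(mean_length, 7))
```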

Sample

1st row: 서울교통공사
2nd row: 서울교통공사
3rd row: 서울교통공사
4th row: 서울교통공사
5th row: 서울교통공사

Common Values

Value    Count    Frequency (%)
코레일    63    86.3%
서울교통공사    10    13.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the common values: 코레일 63 (86.3%), 서울교통공사 10 (13.7%)

선명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 716.0 B

1호선: 73

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1호선
2nd row: 1호선
3rd row: 1호선
4th row: 1호선
5th row: 1호선

Common Values

Value    Count    Frequency (%)
1호선    73    100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the common values: 1호선 73 (100.0%)

역명
Text

Distinct: 66
Distinct (%): 90.4%
Missing: 0
Missing (%): 0.0%
Memory size: 716.0 B

Length

Max length: 12
Median length: 2
Mean length: 2.6164384
Min length: 2

Characters and Unicode

Total characters: 191
Distinct characters: 97
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
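
As an illustration of how such character properties can be read, the sketch below uses Python's standard unicodedata module to count general categories in a few 역명 values taken from the sample rows of this report. It covers categories only (that module does not expose scripts or blocks), and the helper name category_counts is made up for this example.

```python
import unicodedata
from collections import Counter

def category_counts(strings):
    """Count Unicode general categories over all characters in the strings."""
    return Counter(unicodedata.category(ch) for s in strings for ch in s)

# Station names copied from the sample rows of this report.
names = ["서울역", "시청", "종각", "종로5가", "신설동", "청량리(서울시립대입구)"]

counts = category_counts(names)
# Lo = Other Letter, Nd = Decimal Number,
# Ps = Open Punctuation, Pe = Close Punctuation
print(dict(counts))
```

The four category codes that appear here correspond to the four categories listed for this variable.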

Unique

Unique: 59
Unique (%): 80.8%

Sample

1st row: 서울역
2nd row: 시청
3rd row: 종각
4th row: 종로5가
5th row: 신설동

Value    Count    Frequency (%)
구일    2    2.7%
월계    2    2.7%
의정부    2    2.7%
주안    2    2.7%
부평    2    2.7%
지행    2    2.7%
도봉    2    2.7%
성균관대    1    1.4%
안양    1    1.4%
아산    1    1.4%
Other values (56)    56    76.7%

Most occurring characters

Top 10 characters (glyphs not captured), count and frequency: 9 (4.7%), 8 (4.2%), 6 (3.1%), 6 (3.1%), 6 (3.1%), 4 (2.1%), 4 (2.1%), 4 (2.1%), 4 (2.1%), 4 (2.1%)
Other values (87): 136 (71.2%)

Most occurring categories

Value    Count    Frequency (%)
Other Letter    185    96.9%
Close Punctuation    2    1.0%
Open Punctuation    2    1.0%
Decimal Number    2    1.0%

Most frequent character per category

Other Letter
Top 10 characters (glyphs not captured), count and frequency: 9 (4.9%), 8 (4.3%), 6 (3.2%), 6 (3.2%), 6 (3.2%), 4 (2.2%), 4 (2.2%), 4 (2.2%), 4 (2.2%), 4 (2.2%)
Other values (83): 130 (70.3%)

Decimal Number
Value    Count    Frequency (%)
3    1    50.0%
5    1    50.0%

Close Punctuation
Value    Count    Frequency (%)
)    2    100.0%

Open Punctuation
Value    Count    Frequency (%)
(    2    100.0%

Most occurring scripts

Value    Count    Frequency (%)
Hangul    185    96.9%
Common    6    3.1%

Most frequent character per script

Hangul
Top 10 characters (glyphs not captured), count and frequency: 9 (4.9%), 8 (4.3%), 6 (3.2%), 6 (3.2%), 6 (3.2%), 4 (2.2%), 4 (2.2%), 4 (2.2%), 4 (2.2%), 4 (2.2%)
Other values (83): 130 (70.3%)

Common
Value    Count    Frequency (%)
)    2    33.3%
(    2    33.3%
3    1    16.7%
5    1    16.7%

Most occurring blocks

Value    Count    Frequency (%)
Hangul    185    96.9%
ASCII    6    3.1%

Most frequent character per block

Hangul
Top 10 characters (glyphs not captured), count and frequency: 9 (4.9%), 8 (4.3%), 6 (3.2%), 6 (3.2%), 6 (3.2%), 4 (2.2%), 4 (2.2%), 4 (2.2%), 4 (2.2%), 4 (2.2%)
Other values (83): 130 (70.3%)

ASCII
Value    Count    Frequency (%)
)    2    33.3%
(    2    33.3%
3    1    16.7%
5    1    16.7%

지상지하
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 716.0 B

지상: 60
지하: 13

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지하
2nd row: 지하
3rd row: 지하
4th row: 지하
5th row: 지하

Common Values

Value    Count    Frequency (%)
지상    60    82.2%
지하    13    17.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the common values: 지상 60 (82.2%), 지하 13 (17.8%)

역층
Categorical

Distinct: 3
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Memory size: 716.0 B

1: 31
2: 31
3: 11

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value    Count    Frequency (%)
1    31    42.5%
2    31    42.5%
3    11    15.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the common values: 1 31 (42.5%), 2 31 (42.5%), 3 11 (15.1%)

상세위치
Text

MISSING 

Distinct: 60
Distinct (%): 95.2%
Missing: 10
Missing (%): 13.7%
Memory size: 716.0 B

Length

Max length: 27
Median length: 18
Mean length: 14.206349
Min length: 6

Characters and Unicode

Total characters: 895
Distinct characters: 81
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 57
Unique (%): 90.5%
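
The percentages in this variable's summary appear to be taken over the 63 non-missing values rather than all 73 rows, Missing (%) being the exception. That reading is an inference from the numbers, not something the report states; the short check below reproduces the reported figures under that assumption.

```python
n_rows = 73
n_missing = 10
n_present = n_rows - n_missing        # 63 non-missing values

# Counts reported for 상세위치 in this report.
distinct, unique, total_chars = 60, 57, 895

missing_pct = 100 * n_missing / n_rows       # over all rows
distinct_pct = 100 * distinct / n_present    # over non-missing values
unique_pct = 100 * unique / n_present        # over non-missing values
mean_length = total_chars / n_present

print(f"{missing_pct:.1f}% {distinct_pct:.1f}% {unique_pct:.1f}% {mean_length:.6f}")
```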

Sample

1st row: (1F) 표내는 곳 옆
2nd row: (F1) 대합실 12번 출입구 방향
3rd row: (2F) 맞이방 개집표구 옆
4th row: 2층 맞이방 북쪽게이트 옆 역무실
5th row: (2F) 역사 대합실
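
The sample values write the floor in several ways, for example (1F), (F1), and 2층. If the floor needs to be compared with the 역층 column, one option is to normalise the leading token with a regular expression. This is only a sketch based on the notations visible in this report; the pattern and the helper name floor_of are assumptions and may need extending for other rows.

```python
import re

# Optional "(", then "F<n>" or "<n>" with an optional "F"/"층" suffix, then optional ")".
# Caveat: a value that starts with a bare number that is not a floor would be misread.
FLOOR = re.compile(r"^\(?\s*(?:F(\d+)|(\d+)\s*(?:F|층)?)\s*\)?")

def floor_of(location):
    """Return the leading floor number of a 상세위치 value, or None if absent."""
    m = FLOOR.match(location)
    if m is None:
        return None
    return int(m.group(1) or m.group(2))

for text in ["(1F) 표내는 곳 옆", "(F1) 대합실 12번 출입구 방향",
             "2층 맞이방 북쪽게이트 옆 역무실", "맞이방 역무실"]:
    print(floor_of(text), text)
```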

Value    Count    Frequency (%)
맞이방    32    12.2%
(not captured)    22    8.4%
방향    16    6.1%
1f    14    5.3%
역무실    12    4.6%
출입구    12    4.6%
2f    11    4.2%
3f    8    3.0%
(not captured)    8    3.0%
게이트    7    2.7%
Other values (62)    121    46.0%

Most occurring characters

Value    Count    Frequency (%)
(space)    200    22.3%
(not captured)    52    5.8%
(    49    5.5%
)    49    5.5%
(not captured)    41    4.6%
F    37    4.1%
2    34    3.8%
(not captured)    33    3.7%
1    27    3.0%
(not captured)    23    2.6%
Other values (71)    350    39.1%

Most occurring categories

Value    Count    Frequency (%)
Other Letter    481    53.7%
Space Separator    200    22.3%
Decimal Number    76    8.5%
Open Punctuation    49    5.5%
Close Punctuation    49    5.5%
Uppercase Letter    37    4.1%
Math Symbol    2    0.2%
Dash Punctuation    1    0.1%

Most frequent character per category

Other Letter
Top 10 characters (glyphs not captured), count and frequency: 52 (10.8%), 41 (8.5%), 33 (6.9%), 23 (4.8%), 23 (4.8%), 21 (4.4%), 20 (4.2%), 19 (4.0%), 18 (3.7%), 18 (3.7%)
Other values (60): 213 (44.3%)

Decimal Number
Value    Count    Frequency (%)
2    34    44.7%
1    27    35.5%
3    14    18.4%
7    1    1.3%

Math Symbol
Value    Count    Frequency (%)
>    1    50.0%
(not captured)    1    50.0%

Space Separator
Value    Count    Frequency (%)
(space)    200    100.0%

Open Punctuation
Value    Count    Frequency (%)
(    49    100.0%

Close Punctuation
Value    Count    Frequency (%)
)    49    100.0%

Uppercase Letter
Value    Count    Frequency (%)
F    37    100.0%

Dash Punctuation
Value    Count    Frequency (%)
-    1    100.0%

Most occurring scripts

Value    Count    Frequency (%)
Hangul    481    53.7%
Common    377    42.1%
Latin    37    4.1%

Most frequent character per script

Hangul
Top 10 characters (glyphs not captured), count and frequency: 52 (10.8%), 41 (8.5%), 33 (6.9%), 23 (4.8%), 23 (4.8%), 21 (4.4%), 20 (4.2%), 19 (4.0%), 18 (3.7%), 18 (3.7%)
Other values (60): 213 (44.3%)

Common
Value    Count    Frequency (%)
(space)    200    53.1%
(    49    13.0%
)    49    13.0%
2    34    9.0%
1    27    7.2%
3    14    3.7%
>    1    0.3%
7    1    0.3%
-    1    0.3%
(not captured)    1    0.3%

Latin
Value    Count    Frequency (%)
F    37    100.0%

Most occurring blocks

Value    Count    Frequency (%)
Hangul    481    53.7%
ASCII    413    46.1%
Arrows    1    0.1%

Most frequent character per block

ASCII
Value    Count    Frequency (%)
(space)    200    48.4%
(    49    11.9%
)    49    11.9%
F    37    9.0%
2    34    8.2%
1    27    6.5%
3    14    3.4%
>    1    0.2%
7    1    0.2%
-    1    0.2%

Hangul
Top 10 characters (glyphs not captured), count and frequency: 52 (10.8%), 41 (8.5%), 33 (6.9%), 23 (4.8%), 23 (4.8%), 21 (4.4%), 20 (4.2%), 19 (4.0%), 18 (3.7%), 18 (3.7%)
Other values (60): 213 (44.3%)

Arrows
Value    Count    Frequency (%)
(not captured)    1    100.0%

전화번호
Text

MISSING 

Distinct: 69
Distinct (%): 95.8%
Missing: 1
Missing (%): 1.4%
Memory size: 716.0 B

Length

Max length: 12
Median length: 12
Mean length: 11.875
Min length: 11

Characters and Unicode

Total characters: 855
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 66
Unique (%): 91.7%

Sample

1st row: 02-6110-1331
2nd row: 02-6110-1321
3rd row: 02-6110-1311
4th row: 02-6110-1291
5th row: 02-6110-1261
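
The sample values and the length range (11 to 12 characters) are consistent with numbers written as area code, exchange, and subscriber number separated by hyphens. A hedged sketch of a format check follows; the pattern is inferred from the values shown in this report rather than from any official numbering rule, and is_phone is a hypothetical helper.

```python
import re

# "0" plus a 1-2 digit area code, a 3-4 digit exchange, a 4 digit subscriber number.
PHONE = re.compile(r"0\d{1,2}-\d{3,4}-\d{4}")

def is_phone(value):
    """True if the whole string matches the assumed hyphenated layout."""
    return PHONE.fullmatch(value) is not None

# The first three values appear in this report; the last one is deliberately malformed.
samples = ["02-6110-1331", "032-865-7787", "02-965-1467", "026110-1331"]
print([is_phone(s) for s in samples])   # prints: [True, True, True, False]
```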

Value    Count    Frequency (%)
032-865-7787    2    2.8%
032-528-1439    2    2.8%
031-862-2788    2    2.8%
02-965-1467    1    1.4%
031-872-7744    1    1.4%
02-2639-3434    1    1.4%
02-2639-3242    1    1.4%
031-841-7787    1    1.4%
031-448-7788    1    1.4%
042-532-6610    1    1.4%
Other values (59)    59    81.9%

Most occurring characters

Value    Count    Frequency (%)
-    144    16.8%
0    95    11.1%
2    95    11.1%
8    91    10.6%
1    88    10.3%
7    85    9.9%
3    74    8.7%
6    61    7.1%
4    48    5.6%
5    37    4.3%

Most occurring categories

Value    Count    Frequency (%)
Decimal Number    711    83.2%
Dash Punctuation    144    16.8%

Most frequent character per category

Decimal Number
Value    Count    Frequency (%)
0    95    13.4%
2    95    13.4%
8    91    12.8%
1    88    12.4%
7    85    12.0%
3    74    10.4%
6    61    8.6%
4    48    6.8%
5    37    5.2%
9    37    5.2%

Dash Punctuation
Value    Count    Frequency (%)
-    144    100.0%

Most occurring scripts

Value    Count    Frequency (%)
Common    855    100.0%

Most frequent character per script

Common
Value    Count    Frequency (%)
-    144    16.8%
0    95    11.1%
2    95    11.1%
8    91    10.6%
1    88    10.3%
7    85    9.9%
3    74    8.7%
6    61    7.1%
4    48    5.6%
5    37    4.3%

Most occurring blocks

Value    Count    Frequency (%)
ASCII    855    100.0%

Most frequent character per block

ASCII
Value    Count    Frequency (%)
-    144    16.8%
0    95    11.1%
2    95    11.1%
8    91    10.6%
1    88    10.3%
7    85    9.9%
3    74    8.7%
6    61    7.1%
4    48    5.6%
5    37    4.3%

Correlations

Variable    철도운영기관명    역명    지상지하    역층    상세위치    전화번호
철도운영기관명    1.000    1.000    0.887    0.215    NaN    1.000
역명    1.000    1.000    0.477    0.937    0.990    1.000
지상지하    0.887    0.477    1.000    0.226    0.000    0.000
역층    0.215    0.937    0.226    1.000    0.968    0.971
상세위치    NaN    0.990    0.000    0.968    1.000    0.997
전화번호    1.000    1.000    0.000    0.971    0.997    1.000

Variable    철도운영기관명    역층    지상지하
철도운영기관명    1.000    0.349    0.695
역층    0.349    1.000    0.367
지상지하    0.695    0.367    1.000

Variable    철도운영기관명    지상지하    역층
철도운영기관명    1.000    0.695    0.349
지상지하    0.695    1.000    0.367
역층    0.349    0.367    1.000
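
The 0.695 between 철도운영기관명 and 지상지하 can be reproduced approximately from counts in this report. The sketch below assumes the coefficient is a bias-corrected Cramér's V computed with Yates' continuity correction; the report does not name the measure, so that choice is an assumption. The 2x2 table is inferred from the sample rows (the ten 서울교통공사 rows contain nine 지하 and one 지상) together with the totals 지상 60 and 지하 13. The function name is made up for this example.

```python
import math

def cramers_v_2x2(table):
    """Bias-corrected Cramér's V for a 2x2 table, with Yates' correction."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_tot[i] * col_tot[j] / n
            # Yates' continuity correction: shrink |observed - expected| by 0.5.
            diff = max(abs(table[i][j] - expected) - 0.5, 0.0)
            chi2 += diff * diff / expected
    phi2 = chi2 / n
    r = k = 2
    # Bias correction of phi^2 and of the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Rows: 서울교통공사, 코레일; columns: 지상, 지하 (inferred counts, see above).
table = [[1, 9], [59, 4]]
v = cramers_v_2x2(table)
print(round(v, 2))
```

Under these assumptions the result comes out close to the 0.695 shown in the two smaller matrices.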

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

Row    철도운영기관명    선명    역명    지상지하    역층    상세위치    전화번호
0    서울교통공사    1호선    서울역    지하    1    <NA>    02-6110-1331
1    서울교통공사    1호선    시청    지하    1    <NA>    02-6110-1321
2    서울교통공사    1호선    종각    지하    1    <NA>    02-6110-1311
3    서울교통공사    1호선    종로5가    지하    1    <NA>    02-6110-1291
4    서울교통공사    1호선    신설동    지하    1    <NA>    02-6110-1261
5    서울교통공사    1호선    제기동    지하    1    <NA>    02-6110-1251
6    서울교통공사    1호선    청량리(서울시립대입구)    지하    2    <NA>    02-6110-1241
7    서울교통공사    1호선    동묘앞    지상    1    <NA>    02-6110-1271
8    서울교통공사    1호선    종로3가    지하    1    <NA>    02-6110-1301
9    서울교통공사    1호선    동대문    지하    1    <NA>    02-6110-1281

Last rows

Row    철도운영기관명    선명    역명    지상지하    역층    상세위치    전화번호
63    코레일    1호선    주안    지하    1    표 내는 곳 옆    032-865-7787
64    코레일    1호선    지행    지상    1    (1F) 남부 맞이방 내    031-862-2788
65    코레일    1호선    지행    지상    1    (1F) 북부 맞이방 내    031-862-2788
66    코레일    1호선    천안    지상    3    (3F) 동부맞이방 매표소 옆    041-562-7034
67    코레일    1호선    탕정    지상    1    (1) 대합실    041-548-6788
68    코레일    1호선    평택    지상    3    (3F) 맞이방 수유실옆    031-652-0245
69    코레일    1호선    회룡    지상    3    (3F) 대합실 3번 출입구 방향    031-872-7744
70    코레일    1호선    회기    지상    2    (2층) 게이트 입구 좌측(역무실)    02-965-1467
71    코레일    1호선    화서    지상    2    (2층) 맞이방 출입문 근처 게이트 옆    031-242-7788
72    코레일    1호선    직산    지상    2    맞이방 역무실    041-583-7788