Overview

Dataset statistics

Number of variables: 10
Number of observations: 26
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.3 KiB
Average record size in memory: 89.1 B

Variable types

Categorical: 8
Text: 2

Dataset

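Description: Data on the urban and metropolitan railway stations on Seoul Metropolitan Area Line 6, covering the railway operator name (철도운영기관명), line name (선명), station name (역명), wheelchair lift management number (휠체어리프트의 관리번호), entrance number (출입구번호), detailed location (상세위치), length (길이), width (폭), start floor (시작층), and end floor (종료층).
Author: 국가철도공단
URL: https://www.data.go.kr/data/15041433/fileData.do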

Alerts

철도운영기관명 has constant value "서울교통공사" (Constant)
선명 has constant value "6호선" (Constant)
길이 has constant value "125" (Constant)
폭 is highly overall correlated with 출입구번호 (High correlation)
종료층 is highly overall correlated with 출입구번호 and 1 other field (High correlation)
시작층 is highly overall correlated with 출입구번호 and 1 other field (High correlation)
출입구번호 is highly overall correlated with 휠체어리프트의 관리번호 and 3 other fields (High correlation)
휠체어리프트의 관리번호 is highly overall correlated with 출입구번호 (High correlation)
출입구번호 is highly imbalanced (55.4%) (Imbalance)
폭 is highly imbalanced (76.5%) (Imbalance)
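
The imbalance percentages in these alerts can be checked by hand. The sketch below assumes the score is one minus the Shannon entropy of the value counts, normalized by the logarithm of the number of distinct values. That assumption reproduces both reported figures from the counts listed under Common Values, but it is not taken from the ydata-profiling source, and the function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - normalized Shannon entropy for a list of category counts."""
    total = sum(counts)
    if len(counts) < 2 or total == 0:
        return 0.0  # a single category has no class balance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(len(counts))

# Counts taken from the Common Values tables in this report.
entrance_counts = [21, 1, 1, 1, 1, 1]  # 출입구번호: "<NA>" plus five single values
width_counts = [25, 1]                 # 폭 (width): 80 in 25 rows, 90 in 1 row

print(f"{imbalance_score(entrance_counts):.1%}")  # 55.4%
print(f"{imbalance_score(width_counts):.1%}")     # 76.5%
```

A score near 100% means one value dominates the column; a score of 0% means all values are equally frequent.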

Reproduction

Analysis started: 2023-12-12 15:02:28.586846
Analysis finished: 2023-12-12 15:02:29.493614
Duration: 0.91 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
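
A report of this kind could be regenerated roughly as follows. This is a minimal sketch, not the configuration actually used (that is what config.json records): the function name, the CSV path, the report title, and the `cp949` encoding are all assumptions, and reading every column as a string is only a guess at why all ten variables are profiled as Categorical or Text.

```python
def build_report(csv_path, html_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report (paths and encoding are assumptions)."""
    # Imported lazily so the function can be defined without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # dtype=str keeps numeric-looking columns such as 길이 and 폭 as text.
    df = pd.read_csv(csv_path, encoding=encoding, dtype=str)
    report = ProfileReport(df, title="Line 6 wheelchair lifts")
    report.to_file(html_path)
    return report
```

Calling `build_report("lifts.csv")` with a hypothetical local copy of the dataset would write `report.html` to the working directory.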

Variables

철도운영기관명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울교통공사
2nd row: 서울교통공사
3rd row: 서울교통공사
4th row: 서울교통공사
5th row: 서울교통공사

Common Values

| Value | Count | Frequency (%) |
| 서울교통공사 | 26 | 100.0% |

[Figure: Histogram of lengths of the category]

선명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 6호선
2nd row: 6호선
3rd row: 6호선
4th row: 6호선
5th row: 6호선

Common Values

| Value | Count | Frequency (%) |
| 6호선 | 26 | 100.0% |

[Figure: Histogram of lengths of the category]

역명
Text

Distinct: 14
Distinct (%): 53.8%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 14
Median length: 10
Mean length: 5.1923077
Min length: 2

Characters and Unicode

Total characters: 135
Distinct characters: 50
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): 23.1%

Sample

1st row: 불광
2nd row: 불광
3rd row: 구산
4th row: 새절(신사)
5th row: 디지털미디어시티

Words

| Value | Count | Frequency (%) |
| 신당 | 5 | 19.2% |
| 디지털미디어시티 | 3 | 11.5% |
| 불광 | 2 | 7.7% |
| 삼각지 | 2 | 7.7% |
| 상월곡(한국과학기술연구원 | 2 | 7.7% |
| 석계 | 2 | 7.7% |
| 봉화산(서울의료원 | 2 | 7.7% |
| 동묘앞 | 2 | 7.7% |
| 구산 | 1 | 3.8% |
| 새절(신사 | 1 | 3.8% |
| Other values (4) | 4 | 15.4% |

Most occurring characters

| Value | Count | Frequency (%) |
| ( | 8 | 5.9% |
| ) | 8 | 5.9% |
|  | 6 | 4.4% |
|  | 6 | 4.4% |
|  | 5 | 3.7% |
|  | 5 | 3.7% |
|  | 4 | 3.0% |
|  | 4 | 3.0% |
|  | 3 | 2.2% |
|  | 3 | 2.2% |
| Other values (40) | 83 | 61.5% |

Most occurring categories

| Value | Count | Frequency (%) |
| Other Letter | 119 | 88.1% |
| Open Punctuation | 8 | 5.9% |
| Close Punctuation | 8 | 5.9% |

Most frequent character per category

Other Letter
| Value | Count | Frequency (%) |
|  | 6 | 5.0% |
|  | 6 | 5.0% |
|  | 5 | 4.2% |
|  | 5 | 4.2% |
|  | 4 | 3.4% |
|  | 4 | 3.4% |
|  | 3 | 2.5% |
|  | 3 | 2.5% |
|  | 3 | 2.5% |
|  | 3 | 2.5% |
| Other values (38) | 77 | 64.7% |

Open Punctuation
| Value | Count | Frequency (%) |
| ( | 8 | 100.0% |

Close Punctuation
| Value | Count | Frequency (%) |
| ) | 8 | 100.0% |

Most occurring scripts

| Value | Count | Frequency (%) |
| Hangul | 119 | 88.1% |
| Common | 16 | 11.9% |

Most frequent character per script

Hangul
| Value | Count | Frequency (%) |
|  | 6 | 5.0% |
|  | 6 | 5.0% |
|  | 5 | 4.2% |
|  | 5 | 4.2% |
|  | 4 | 3.4% |
|  | 4 | 3.4% |
|  | 3 | 2.5% |
|  | 3 | 2.5% |
|  | 3 | 2.5% |
|  | 3 | 2.5% |
| Other values (38) | 77 | 64.7% |

Common
| Value | Count | Frequency (%) |
| ( | 8 | 50.0% |
| ) | 8 | 50.0% |

Most occurring blocks

| Value | Count | Frequency (%) |
| Hangul | 119 | 88.1% |
| ASCII | 16 | 11.9% |

Most frequent character per block

ASCII
| Value | Count | Frequency (%) |
| ( | 8 | 50.0% |
| ) | 8 | 50.0% |

Hangul
| Value | Count | Frequency (%) |
|  | 6 | 5.0% |
|  | 6 | 5.0% |
|  | 5 | 4.2% |
|  | 5 | 4.2% |
|  | 4 | 3.4% |
|  | 4 | 3.4% |
|  | 3 | 2.5% |
|  | 3 | 2.5% |
|  | 3 | 2.5% |
|  | 3 | 2.5% |
| Other values (38) | 77 | 64.7% |
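
The category counts above come from Unicode character properties. As a rough illustration, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for the five sample values of 역명 only, so its totals are smaller than the full-column figures in the report; the helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories (e.g. 'Lo', 'Ps', 'Pe') over all characters."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)

# The five sample rows shown for 역명.
sample = ["불광", "불광", "구산", "새절(신사)", "디지털미디어시티"]
counts = category_counts(sample)

# 'Lo' = Other Letter (Hangul syllables), 'Ps' = Open Punctuation, 'Pe' = Close Punctuation
print(counts["Lo"], counts["Ps"], counts["Pe"])  # 18 1 1
```

The two-letter codes map onto the report's labels: `Lo` is "Other Letter", `Ps` is "Open Punctuation", and `Pe` is "Close Punctuation".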

휠체어리프트의 관리번호
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 19.2%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 2
Unique (%): 7.7%

Sample

1st row: 1
2nd row: 2
3rd row: 1
4th row: 1
5th row: 1

Common Values

| Value | Count | Frequency (%) |
| 1 | 14 | 53.8% |
| 2 | 8 | 30.8% |
| 3 | 2 | 7.7% |
| 4 | 1 | 3.8% |
| 5 | 1 | 3.8% |

[Figure: Histogram of lengths of the category]
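
The Distinct, Unique, and Frequency figures follow directly from the value counts. A small sketch, using the counts reported for 휠체어리프트의 관리번호 and assuming that "distinct" means the number of different values and "unique" the number of values that occur exactly once (an interpretation that matches the numbers in this report); the function name `summarize` is illustrative:

```python
def summarize(value_counts):
    """Derive distinct/unique statistics from a {value: count} mapping."""
    n = sum(value_counts.values())
    distinct = len(value_counts)
    unique = sum(1 for c in value_counts.values() if c == 1)
    return {
        "n": n,
        "distinct": distinct,
        "distinct_pct": round(100 * distinct / n, 1),
        "unique": unique,
        "unique_pct": round(100 * unique / n, 1),
        "frequency_pct": {v: round(100 * c / n, 1) for v, c in value_counts.items()},
    }

# Counts from the Common Values table for 휠체어리프트의 관리번호.
lift_counts = {"1": 14, "2": 8, "3": 2, "4": 1, "5": 1}
stats = summarize(lift_counts)
print(stats["distinct"], stats["distinct_pct"])  # 5 19.2
print(stats["unique"], stats["unique_pct"])      # 2 7.7
```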

출입구번호
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 23.1%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.4230769
Min length: 1

Unique

Unique: 5
Unique (%): 19.2%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

| Value | Count | Frequency (%) |
| <NA> | 21 | 80.8% |
| 3 | 1 | 3.8% |
| 2 | 1 | 3.8% |
| 7 | 1 | 3.8% |
| 9 | 1 | 3.8% |
| 5 | 1 | 3.8% |

[Figure: Histogram of lengths of the category]

상세위치
Text

Distinct: 25
Distinct (%): 96.2%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 30
Median length: 22
Mean length: 20.769231
Min length: 14

Characters and Unicode

Total characters: 540
Distinct characters: 56
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 24
Unique (%): 92.3%

Sample

1st row: (B1)종점3환승통로(B4-B1)-종3가측
2nd row: (B1)시점3환승통로(B2-B1)-연신내측
3rd row: (B2)승-대(B4-B1)
4th row: (B1)하선 승/시-대(B2-B1)
5th row: (B2)공항철도 환승 상선 승/종-대(B2-B1)

Words

| Value | Count | Frequency (%) |
| 환승 | 6 | 11.3% |
| b1)대-대(b2-b1 | 2 | 3.8% |
| 대(b3-b2 | 2 | 3.8% |
| 시/승-대(b3-b2 | 2 | 3.8% |
| b2)1호선 | 2 | 3.8% |
| 시/승 | 2 | 3.8% |
|  | 2 | 3.8% |
| b1)2호선 | 2 | 3.8% |
| 하선 | 2 | 3.8% |
| 승/종-대(b2-b1 | 2 | 3.8% |
| Other values (27) | 29 | 54.7% |

Most occurring characters

| Value | Count | Frequency (%) |
| B | 68 | 12.6% |
| ( | 60 | 11.1% |
| ) | 60 | 11.1% |
| - | 44 | 8.1% |
| 1 | 42 | 7.8% |
| 2 | 30 | 5.6% |
| (space) | 27 | 5.0% |
|  | 25 | 4.6% |
|  | 23 | 4.3% |
| 3 | 13 | 2.4% |
| Other values (46) | 148 | 27.4% |

Most occurring categories

| Value | Count | Frequency (%) |
| Other Letter | 170 | 31.5% |
| Decimal Number | 92 | 17.0% |
| Uppercase Letter | 77 | 14.3% |
| Open Punctuation | 60 | 11.1% |
| Close Punctuation | 60 | 11.1% |
| Dash Punctuation | 44 | 8.1% |
| Space Separator | 27 | 5.0% |
| Other Punctuation | 10 | 1.9% |

Most frequent character per category

Other Letter
| Value | Count | Frequency (%) |
|  | 25 | 14.7% |
|  | 23 | 13.5% |
|  | 11 | 6.5% |
|  | 10 | 5.9% |
|  | 8 | 4.7% |
|  | 7 | 4.1% |
|  | 5 | 2.9% |
|  | 5 | 2.9% |
|  | 5 | 2.9% |
|  | 4 | 2.4% |
| Other values (33) | 67 | 39.4% |

Decimal Number
| Value | Count | Frequency (%) |
| 1 | 42 | 45.7% |
| 2 | 30 | 32.6% |
| 3 | 13 | 14.1% |
| 4 | 4 | 4.3% |
| 5 | 2 | 2.2% |
| 9 | 1 | 1.1% |

Uppercase Letter
| Value | Count | Frequency (%) |
| B | 68 | 88.3% |
| F | 9 | 11.7% |

Open Punctuation
| Value | Count | Frequency (%) |
| ( | 60 | 100.0% |

Close Punctuation
| Value | Count | Frequency (%) |
| ) | 60 | 100.0% |

Dash Punctuation
| Value | Count | Frequency (%) |
| - | 44 | 100.0% |

Space Separator
| Value | Count | Frequency (%) |
| (space) | 27 | 100.0% |

Other Punctuation
| Value | Count | Frequency (%) |
| / | 10 | 100.0% |

Most occurring scripts

| Value | Count | Frequency (%) |
| Common | 293 | 54.3% |
| Hangul | 170 | 31.5% |
| Latin | 77 | 14.3% |

Most frequent character per script

Hangul
| Value | Count | Frequency (%) |
|  | 25 | 14.7% |
|  | 23 | 13.5% |
|  | 11 | 6.5% |
|  | 10 | 5.9% |
|  | 8 | 4.7% |
|  | 7 | 4.1% |
|  | 5 | 2.9% |
|  | 5 | 2.9% |
|  | 5 | 2.9% |
|  | 4 | 2.4% |
| Other values (33) | 67 | 39.4% |

Common
| Value | Count | Frequency (%) |
| ( | 60 | 20.5% |
| ) | 60 | 20.5% |
| - | 44 | 15.0% |
| 1 | 42 | 14.3% |
| 2 | 30 | 10.2% |
| (space) | 27 | 9.2% |
| 3 | 13 | 4.4% |
| / | 10 | 3.4% |
| 4 | 4 | 1.4% |
| 5 | 2 | 0.7% |

Latin
| Value | Count | Frequency (%) |
| B | 68 | 88.3% |
| F | 9 | 11.7% |

Most occurring blocks

| Value | Count | Frequency (%) |
| ASCII | 370 | 68.5% |
| Hangul | 170 | 31.5% |

Most frequent character per block

ASCII
| Value | Count | Frequency (%) |
| B | 68 | 18.4% |
| ( | 60 | 16.2% |
| ) | 60 | 16.2% |
| - | 44 | 11.9% |
| 1 | 42 | 11.4% |
| 2 | 30 | 8.1% |
| (space) | 27 | 7.3% |
| 3 | 13 | 3.5% |
| / | 10 | 2.7% |
| F | 9 | 2.4% |
| Other values (3) | 7 | 1.9% |

Hangul
| Value | Count | Frequency (%) |
|  | 25 | 14.7% |
|  | 23 | 13.5% |
|  | 11 | 6.5% |
|  | 10 | 5.9% |
|  | 8 | 4.7% |
|  | 7 | 4.1% |
|  | 5 | 2.9% |
|  | 5 | 2.9% |
|  | 5 | 2.9% |
|  | 4 | 2.4% |
| Other values (33) | 67 | 39.4% |

길이
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 125
2nd row: 125
3rd row: 125
4th row: 125
5th row: 125

Common Values

| Value | Count | Frequency (%) |
| 125 | 26 | 100.0% |

[Figure: Histogram of lengths of the category]


폭
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 3.8%

Sample

1st row: 80
2nd row: 80
3rd row: 80
4th row: 80
5th row: 80

Common Values

| Value | Count | Frequency (%) |
| 80 | 25 | 96.2% |
| 90 | 1 | 3.8% |

[Figure: Histogram of lengths of the category]

시작층
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 15.4%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 1
Unique (%): 3.8%

Sample

1st row: 지하1
2nd row: 지하1
3rd row: 지하2
4th row: 지하1
5th row: 지하2

Common Values

| Value | Count | Frequency (%) |
| 지하1 | 12 | 46.2% |
| 지하2 | 8 | 30.8% |
| 지상1 | 5 | 19.2% |
| 지하3 | 1 | 3.8% |

[Figure: Histogram of lengths of the category]

종료층
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 15.4%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 1
Unique (%): 3.8%

Sample

1st row: 지하2
2nd row: 지하2
3rd row: 지하3
4th row: 지하2
5th row: 지하3

Common Values

| Value | Count | Frequency (%) |
| 지하2 | 11 | 42.3% |
| 지하3 | 9 | 34.6% |
| 지하1 | 5 | 19.2% |
| 지하4 | 1 | 3.8% |

[Figure: Histogram of lengths of the category]
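
Because 시작층 and 종료층 are stored as strings such as 지하2 (second basement level) and 지상1 (first floor above ground), comparing them numerically requires parsing. A minimal sketch, assuming every value follows the prefix-plus-digits pattern seen in this report; the function name and the signed-integer convention are illustrative choices:

```python
import re

FLOOR_PATTERN = re.compile(r"^(지하|지상)(\d+)$")

def parse_floor(label):
    """Convert '지하N' to -N and '지상N' to +N; raise ValueError on anything else."""
    match = FLOOR_PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized floor label: {label!r}")
    prefix, number = match.groups()
    return -int(number) if prefix == "지하" else int(number)

# First sample row of the report: the lift runs from 지하1 to 지하2.
start, end = parse_floor("지하1"), parse_floor("지하2")
print(start, end, abs(end - start))  # -1 -2 1
```

With floors as signed integers, the number of levels a lift spans is a simple absolute difference.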

Correlations

|  | 역명 | 휠체어리프트의 관리번호 | 출입구번호 | 상세위치 | 폭 | 시작층 | 종료층 |
| 역명 | 1.000 | 0.000 | 1.000 | 0.872 | 1.000 | 0.445 | 0.645 |
| 휠체어리프트의 관리번호 | 0.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.084 |
| 출입구번호 | 1.000 | 1.000 | 1.000 | 1.000 | NaN | NaN | 1.000 |
| 상세위치 | 0.872 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| 폭 | 1.000 | 0.000 | NaN | 1.000 | 1.000 | 0.000 | 0.000 |
| 시작층 | 0.445 | 0.000 | NaN | 1.000 | 0.000 | 1.000 | 0.936 |
| 종료층 | 0.645 | 0.084 | 1.000 | 1.000 | 0.000 | 0.936 | 1.000 |

|  | 폭 | 종료층 | 시작층 | 출입구번호 | 휠체어리프트의 관리번호 |
| 폭 | 1.000 | 0.000 | 0.000 | 1.000 | 0.000 |
| 종료층 | 0.000 | 1.000 | 0.662 | 1.000 | 0.000 |
| 시작층 | 0.000 | 0.662 | 1.000 | 1.000 | 0.000 |
| 출입구번호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| 휠체어리프트의 관리번호 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 |

|  | 휠체어리프트의 관리번호 | 출입구번호 | 폭 | 시작층 | 종료층 |
| 휠체어리프트의 관리번호 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 |
| 출입구번호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| 폭 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000 |
| 시작층 | 0.000 | 1.000 | 0.000 | 1.000 | 0.662 |
| 종료층 | 0.000 | 1.000 | 0.000 | 0.662 | 1.000 |
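
For pairs of categorical variables, one common association measure is Cramér's V, derived from the chi-squared statistic of the contingency table. Whether the matrices above were computed this way (and with or without bias correction) is an assumption, not something the report states. The sketch below implements the uncorrected form with the standard library only and uses made-up label pairs, since the full 26 rows are not reproduced in this report.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length sequences of labels."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    x_counts, y_counts = Counter(xs), Counter(ys)
    k = min(len(x_counts), len(y_counts))
    if n == 0 or k < 2:
        return float("nan")  # undefined when either variable is constant
    joint = Counter(zip(xs, ys))
    chi2 = 0.0
    for x, cx in x_counts.items():
        for y, cy in y_counts.items():
            expected = cx * cy / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (k - 1)))

# Illustrative data only: a perfectly dependent pair and an independent pair.
print(cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"]))  # 1.0
print(cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"]))  # 0.0
```

With only 26 observations and several near-unique columns, values of exactly 1.000 are to be expected under any such measure and say little about a real relationship.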

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]
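
The report counts zero missing cells even though 출입구번호 is "<NA>" in 21 of 26 rows, because the placeholder is stored as an ordinary string. A hedged sketch of how such placeholders could be mapped to real missing values before counting. The placeholder set and the helper name `to_missing` are assumptions, and the list holds only the 20 entrance numbers visible in the Sample section (rows 0-9 and 16-25), not the full column.

```python
PLACEHOLDERS = {"<NA>", "", "nan", "NaN"}  # assumed spellings of "no value"

def to_missing(value):
    """Return None for placeholder strings, otherwise the value unchanged."""
    return None if value.strip() in PLACEHOLDERS else value

entrance_numbers = [
    "<NA>", "<NA>", "<NA>", "<NA>", "<NA>", "<NA>", "3", "<NA>", "<NA>", "2",  # rows 0-9
    "9", "<NA>", "<NA>", "<NA>", "<NA>", "<NA>", "<NA>", "<NA>", "<NA>", "5",  # rows 16-25
]
cleaned = [to_missing(v) for v in entrance_numbers]
missing = sum(v is None for v in cleaned)
print(missing, len(cleaned))  # 16 20
```

After this kind of normalization, a profiling run would report the column as mostly missing rather than as a highly imbalanced categorical.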

Sample

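|  | 철도운영기관명 | 선명 | 역명 | 휠체어리프트의 관리번호 | 출입구번호 | 상세위치 | 길이 | 폭 | 시작층 | 종료층 |
| 0 | 서울교통공사 | 6호선 | 불광 | 1 | <NA> | (B1)종점3환승통로(B4-B1)-종3가측 | 125 | 80 | 지하1 | 지하2 |
| 1 | 서울교통공사 | 6호선 | 불광 | 2 | <NA> | (B1)시점3환승통로(B2-B1)-연신내측 | 125 | 80 | 지하1 | 지하2 |
| 2 | 서울교통공사 | 6호선 | 구산 | 1 | <NA> | (B2)승-대(B4-B1) | 125 | 80 | 지하2 | 지하3 |
| 3 | 서울교통공사 | 6호선 | 새절(신사) | 1 | <NA> | (B1)하선 승/시-대(B2-B1) | 125 | 80 | 지하1 | 지하2 |
| 4 | 서울교통공사 | 6호선 | 디지털미디어시티 | 1 | <NA> | (B2)공항철도 환승 상선 승/종-대(B2-B1) | 125 | 80 | 지하2 | 지하3 |
| 5 | 서울교통공사 | 6호선 | 디지털미디어시티 | 2 | <NA> | (B3)공항철도 환승 하선 승/종-대(B2-B1) | 125 | 80 | 지하3 | 지하3 |
| 6 | 서울교통공사 | 6호선 | 디지털미디어시티 | 3 | 3 | (F1)3출입구(종점측)(B1-F1) | 125 | 80 | 지상1 | 지하2 |
| 7 | 서울교통공사 | 6호선 | 합정 | 1 | <NA> | (B2)대합실-대합실 (B1층) | 125 | 90 | 지하2 | 지하3 |
| 8 | 서울교통공사 | 6호선 | 광흥창(서강) | 1 | <NA> | (B2)승/종-대(B3-B2) | 125 | 80 | 지하2 | 지하3 |
| 9 | 서울교통공사 | 6호선 | 대흥(서강대앞) | 1 | 2 | (F1)2번 출입구(B1-1F) | 125 | 80 | 지상1 | 지하1 |

|  | 철도운영기관명 | 선명 | 역명 | 휠체어리프트의 관리번호 | 출입구번호 | 상세위치 | 길이 | 폭 | 시작층 | 종료층 |
| 16 | 서울교통공사 | 6호선 | 신당 | 5 | 9 | (F1)9번출입구(B1-F1) | 125 | 80 | 지상1 | 지하1 |
| 17 | 서울교통공사 | 6호선 | 월곡(동덕여대) | 1 | <NA> | (B1)대-대(B2-B1) | 125 | 80 | 지하1 | 지하2 |
| 18 | 서울교통공사 | 6호선 | 상월곡(한국과학기술연구원) | 1 | <NA> | (B2)월곡방향 시/승-대(B3-B2) | 125 | 80 | 지하2 | 지하3 |
| 19 | 서울교통공사 | 6호선 | 상월곡(한국과학기술연구원) | 2 | <NA> | (B2)돌곶이방향 시/승-대(B3-B2) | 125 | 80 | 지하2 | 지하3 |
| 20 | 서울교통공사 | 6호선 | 석계 | 1 | <NA> | (B2)1호선 환승 상선 시/승 - 대(B3-B2) | 125 | 80 | 지하2 | 지하3 |
| 21 | 서울교통공사 | 6호선 | 석계 | 2 | <NA> | (B2)1호선 환승 하선 시/승 - 대(B3-B2) | 125 | 80 | 지하2 | 지하3 |
| 22 | 서울교통공사 | 6호선 | 봉화산(서울의료원) | 1 | <NA> | (B1)화랑대방향 종/승-대(B2-B1) | 125 | 80 | 지하1 | 지하2 |
| 23 | 서울교통공사 | 6호선 | 봉화산(서울의료원) | 2 | <NA> | (B1)신내방향 종/승-대(B2-B1) | 125 | 80 | 지하1 | 지하2 |
| 24 | 서울교통공사 | 6호선 | 동묘앞 | 1 | <NA> | (B1)대-대(B2-B1) | 125 | 80 | 지하1 | 지하2 |
| 25 | 서울교통공사 | 6호선 | 동묘앞 | 2 | 5 | (F1)5출입구(B1-F1) | 125 | 80 | 지상1 | 지하1 |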