Overview

Dataset statistics

Number of variables: 8
Number of observations: 36
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.4 KiB
Average record size in memory: 68.7 B

Variable types

Categorical: 6
Text: 2

Dataset

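Description: Restroom locations for each station on Line 2 (2호선), operated by 대구도시철도공사. The data covers the railway operator name (철도운영기관명), line name (선명), station name (역명), the station floor of the restroom (역층), whether it is inside or outside the fare gate (게이트내외), the exit number (출구번호), and the detailed location (상세위치).
Author: 국가철도공단
URL: https://www.data.go.kr/data/15041242/fileData.do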

Alerts

철도운영기관명 has constant value "대구교통공사" (Constant)
선명 has constant value "2호선" (Constant)
지상지하구분 is highly overall correlated with 출구번호 (High correlation)
출구번호 is highly overall correlated with 지상지하구분 (High correlation)
지상지하구분 is highly imbalanced (81.7%) (Imbalance)
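
The 81.7% in the imbalance alert is consistent with one minus the normalized Shannon entropy of the class frequencies of 지상지하구분 (35 × 지하, 1 × 지상). The sketch below works under that assumption. The function name `imbalance_score` is illustrative and is not taken from the profiling library, and returning 0.0 for a single-class column is a choice made here.

```python
import math
from collections import Counter


def imbalance_score(values):
    """Return 1 - (Shannon entropy / maximum entropy) of the class frequencies.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    counts = Counter(values)
    n_classes = len(counts)
    if n_classes < 2:
        # Assumption for this sketch: a constant column gets a score of 0.0.
        return 0.0
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return 1.0 - entropy / math.log2(n_classes)


# Class counts as reported for 지상지하구분: 35 rows of 지하 and 1 row of 지상.
column = ["지하"] * 35 + ["지상"]
score = imbalance_score(column)
print(f"{score:.1%}")  # prints 81.7%
```

A perfectly balanced two-class column would score 0.0 under the same definition, so a value of 81.7% signals that almost all rows share one class.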

Reproduction

Analysis started: 2023-12-12 22:27:17.018234
Analysis finished: 2023-12-12 22:27:17.772791
Duration: 0.75 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
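
A report of this kind can be regenerated roughly as sketched below. This is a minimal sketch, not the exact configuration used for this run: the CSV path is hypothetical, the `cp949` encoding is an assumption about the downloaded file, and the function name `build_report` is illustrative. It needs `pandas` and `ydata-profiling` installed; the imports are deferred so the definition itself loads without them.

```python
def build_report(csv_path, html_path="report.html", encoding="cp949"):
    """Profile a CSV file and write the result as an HTML report.

    csv_path is a hypothetical local copy of the dataset; the default
    encoding is an assumption and may need changing for the actual file.
    """
    # Deferred imports: only needed when the function is called.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Line 2 restroom locations")
    report.to_file(html_path)
    return report
```

Calling `build_report("restrooms_line2.csv")` with a real local file would write `report.html` next to the script.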

Variables

철도운영기관명
Categorical

CONSTANT

Distinct: 1
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Memory size: 420.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대구교통공사
2nd row: 대구교통공사
3rd row: 대구교통공사
4th row: 대구교통공사
5th row: 대구교통공사

Common Values

Value | Count | Frequency (%)
대구교통공사 | 36 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

선명
Categorical

CONSTANT

Distinct: 1
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Memory size: 420.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2호선
2nd row: 2호선
3rd row: 2호선
4th row: 2호선
5th row: 2호선

Common Values

Value | Count | Frequency (%)
2호선 | 36 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

역명
Text

Distinct: 31
Distinct (%): 86.1%
Missing: 0
Missing (%): 0.0%
Memory size: 420.0 B

Length

Max length: 6
Median length: 2
Mean length: 2.6944444
Min length: 2

Characters and Unicode

Total characters: 97
Distinct characters: 63
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 26
Unique (%): 72.2%
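
The Distinct and Unique figures measure different things: Distinct counts how many different values occur, while Unique counts the values that occur exactly once. For 역명 the two agree with the value counts in this report, since 26 names appearing once plus 5 names appearing twice give 31 distinct names over 36 rows. A minimal sketch of the two counts, using a made-up list and an illustrative function name:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Toy input for illustration only, not the full column.
names = ["정평", "정평", "청라언덕", "청라언덕", "연호", "영남대"]
distinct, unique = distinct_and_unique(names)
print(distinct, unique)  # prints 4 2
```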

Sample

1st row: 대구역
2nd row: 중앙로
3rd row: 감삼
4th row: 강창
5th row: 경대병원

Value | Count | Frequency (%)
청라언덕 | 2 | 5.6%
사월 | 2 | 5.6%
2반월당 | 2 | 5.6%
정평 | 2 | 5.6%
대공원 | 2 | 5.6%
영남대 | 1 | 2.8%
성서산업단지 | 1 | 2.8%
수성구청 | 1 | 2.8%
신매 | 1 | 2.8%
연호 | 1 | 2.8%
Other values (21) | 21 | 58.3%

Most occurring characters

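Value | Count | Frequency (%)
(not recovered) | 8 | 8.2%
(not recovered) | 4 | 4.1%
(not recovered) | 4 | 4.1%
(not recovered) | 3 | 3.1%
(not recovered) | 3 | 3.1%
(not recovered) | 3 | 3.1%
(not recovered) | 3 | 3.1%
(not recovered) | 3 | 3.1%
(not recovered) | 3 | 3.1%
(not recovered) | 2 | 2.1%
Other values (53) | 61 | 62.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 95 | 97.9%
Decimal Number | 2 | 2.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not recovered) | 8 | 8.4%
(not recovered) | 4 | 4.2%
(not recovered) | 4 | 4.2%
(not recovered) | 3 | 3.2%
(not recovered) | 3 | 3.2%
(not recovered) | 3 | 3.2%
(not recovered) | 3 | 3.2%
(not recovered) | 3 | 3.2%
(not recovered) | 3 | 3.2%
(not recovered) | 2 | 2.1%
Other values (52) | 59 | 62.1%

Decimal Number
Value | Count | Frequency (%)
2 | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 95 | 97.9%
Common | 2 | 2.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not recovered) | 8 | 8.4%
(not recovered) | 4 | 4.2%
(not recovered) | 4 | 4.2%
(not recovered) | 3 | 3.2%
(not recovered) | 3 | 3.2%
(not recovered) | 3 | 3.2%
(not recovered) | 3 | 3.2%
(not recovered) | 3 | 3.2%
(not recovered) | 3 | 3.2%
(not recovered) | 2 | 2.1%
Other values (52) | 59 | 62.1%

Common
Value | Count | Frequency (%)
2 | 2 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 95 | 97.9%
ASCII | 2 | 2.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not recovered) | 8 | 8.4%
(not recovered) | 4 | 4.2%
(not recovered) | 4 | 4.2%
(not recovered) | 3 | 3.2%
(not recovered) | 3 | 3.2%
(not recovered) | 3 | 3.2%
(not recovered) | 3 | 3.2%
(not recovered) | 3 | 3.2%
(not recovered) | 3 | 3.2%
(not recovered) | 2 | 2.1%
Other values (52) | 59 | 62.1%

ASCII
Value | Count | Frequency (%)
2 | 2 | 100.0%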

지상지하구분
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 5.6%
Missing: 0
Missing (%): 0.0%
Memory size: 420.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 2.8%

Sample

1st row: 지하
2nd row: 지하
3rd row: 지하
4th row: 지하
5th row: 지하

Common Values

Value | Count | Frequency (%)
지하 | 35 | 97.2%
지상 | 1 | 2.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

역층
Categorical

Distinct: 4
Distinct (%): 11.1%
Missing: 0
Missing (%): 0.0%
Memory size: 420.0 B

Length

Max length: 3
Median length: 1
Mean length: 1.0555556
Min length: 1

Unique

Unique: 1
Unique (%): 2.8%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 17 | 47.2%
2 | 11 | 30.6%
3 | 7 | 19.4%
1,2 | 1 | 2.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

게이트내외
Categorical

Distinct: 2
Distinct (%): 5.6%
Missing: 0
Missing (%): 0.0%
Memory size: 420.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 외부
2nd row: 외부
3rd row: 외부
4th row: 외부
5th row: 외부

Common Values

Value | Count | Frequency (%)
외부 | 30 | 83.3%
내부 | 6 | 16.7%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

출구번호
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): 16.7%
Missing: 0
Missing (%): 0.0%
Memory size: 420.0 B

Length

Max length: 4
Median length: 2.5
Mean length: 2.5
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: 4

Common Values

Value | Count | Frequency (%)
<NA> | 18 | 50.0%
4 | 9 | 25.0%
3 | 3 | 8.3%
1 | 2 | 5.6%
2 | 2 | 5.6%
9 | 2 | 5.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

상세위치
Text

Distinct: 33
Distinct (%): 91.7%
Missing: 0
Missing (%): 0.0%
Memory size: 420.0 B

Length

Max length: 59
Median length: 24
Mean length: 18.583333
Min length: 10

Characters and Unicode

Total characters: 669
Distinct characters: 98
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
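
The category counts reported for this variable can be approximated with Python's standard `unicodedata` module, which returns the general category of each code point as a two-letter code. The sketch below tallies the categories of a single 상세위치 value from the sample rows. The mapping from codes to the long names used in this report is written out by hand for the seven categories that appear here, and the function name is illustrative.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
}


def category_counts(text):
    """Count the characters of `text` by Unicode general category."""
    codes = (unicodedata.category(ch) for ch in text)
    # Fall back to the raw two-letter code for categories not listed above.
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)


counts = category_counts("(B1) 3번/4번 출입구 사이")
for name, count in counts.most_common():
    print(name, count)
```

Hangul syllables fall under Other Letter, which is why that category dominates both text variables in this report.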

Unique

Unique: 30
Unique (%): 83.3%

Sample

1st row: (B2) 안심방면 표내는곳 옆
2nd row: (B2) 역무실 반대측 표내는 곳 옆
3rd row: (B2) 표내는 곳 옆
4th row: (B2) 대합실 영남대방면 승강장 내려가는 계단 옆
5th row: (B1) 표내는곳 앞

Value | Count | Frequency (%)
(not recovered) | 26 | 14.9%
b1 | 18 | 10.3%
b2 | 12 | 6.9%
승강장 | 11 | 6.3%
내려가는 | 11 | 6.3%
계단 | 7 | 4.0%
(not recovered) | 7 | 4.0%
b3 | 6 | 3.4%
표내는곳 | 5 | 2.9%
방면 | 5 | 2.9%
Other values (40) | 66 | 37.9%

Most occurring characters

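Value | Count | Frequency (%)
(space) | 139 | 20.8%
( | 37 | 5.5%
) | 37 | 5.5%
B | 37 | 5.5%
(not recovered) | 27 | 4.0%
1 | 20 | 3.0%
(not recovered) | 19 | 2.8%
(not recovered) | 19 | 2.8%
(not recovered) | 16 | 2.4%
(not recovered) | 13 | 1.9%
Other values (88) | 305 | 45.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 368 | 55.0%
Space Separator | 139 | 20.8%
Decimal Number | 45 | 6.7%
Uppercase Letter | 40 | 6.0%
Open Punctuation | 37 | 5.5%
Close Punctuation | 37 | 5.5%
Other Punctuation | 3 | 0.4%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not recovered) | 27 | 7.3%
(not recovered) | 19 | 5.2%
(not recovered) | 19 | 5.2%
(not recovered) | 16 | 4.3%
(not recovered) | 13 | 3.5%
(not recovered) | 13 | 3.5%
(not recovered) | 13 | 3.5%
(not recovered) | 12 | 3.3%
(not recovered) | 11 | 3.0%
(not recovered) | 11 | 3.0%
Other values (76) | 214 | 58.2%

Uppercase Letter
Value | Count | Frequency (%)
B | 37 | 92.5%
F | 1 | 2.5%
E | 1 | 2.5%
V | 1 | 2.5%

Decimal Number
Value | Count | Frequency (%)
1 | 20 | 44.4%
2 | 13 | 28.9%
3 | 8 | 17.8%
4 | 4 | 8.9%

Space Separator
Value | Count | Frequency (%)
(space) | 139 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 37 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 37 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 368 | 55.0%
Common | 261 | 39.0%
Latin | 40 | 6.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not recovered) | 27 | 7.3%
(not recovered) | 19 | 5.2%
(not recovered) | 19 | 5.2%
(not recovered) | 16 | 4.3%
(not recovered) | 13 | 3.5%
(not recovered) | 13 | 3.5%
(not recovered) | 13 | 3.5%
(not recovered) | 12 | 3.3%
(not recovered) | 11 | 3.0%
(not recovered) | 11 | 3.0%
Other values (76) | 214 | 58.2%

Common
Value | Count | Frequency (%)
(space) | 139 | 53.3%
( | 37 | 14.2%
) | 37 | 14.2%
1 | 20 | 7.7%
2 | 13 | 5.0%
3 | 8 | 3.1%
4 | 4 | 1.5%
/ | 3 | 1.1%

Latin
Value | Count | Frequency (%)
B | 37 | 92.5%
F | 1 | 2.5%
E | 1 | 2.5%
V | 1 | 2.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 368 | 55.0%
ASCII | 301 | 45.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 139 | 46.2%
( | 37 | 12.3%
) | 37 | 12.3%
B | 37 | 12.3%
1 | 20 | 6.6%
2 | 13 | 4.3%
3 | 8 | 2.7%
4 | 4 | 1.3%
/ | 3 | 1.0%
F | 1 | 0.3%
Other values (2) | 2 | 0.7%

Hangul
Value | Count | Frequency (%)
(not recovered) | 27 | 7.3%
(not recovered) | 19 | 5.2%
(not recovered) | 19 | 5.2%
(not recovered) | 16 | 4.3%
(not recovered) | 13 | 3.5%
(not recovered) | 13 | 3.5%
(not recovered) | 13 | 3.5%
(not recovered) | 12 | 3.3%
(not recovered) | 11 | 3.0%
(not recovered) | 11 | 3.0%
Other values (76) | 214 | 58.2%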

Correlations

 | 역명 | 지상지하구분 | 역층 | 게이트내외 | 출구번호 | 상세위치
역명 | 1.000 | 1.000 | 0.930 | 0.000 | 1.000 | 0.901
지상지하구분 | 1.000 | 1.000 | 0.294 | 0.000 | NaN | 1.000
역층 | 0.930 | 0.294 | 1.000 | 0.310 | 0.000 | 1.000
게이트내외 | 0.000 | 0.000 | 0.310 | 1.000 | 0.324 | 1.000
출구번호 | 1.000 | NaN | 0.000 | 0.324 | 1.000 | 0.902
상세위치 | 0.901 | 1.000 | 1.000 | 1.000 | 0.902 | 1.000

 | 지상지하구분 | 역층 | 출구번호 | 게이트내외
지상지하구분 | 1.000 | 0.183 | 1.000 | 0.000
역층 | 0.183 | 1.000 | 0.000 | 0.194
출구번호 | 1.000 | 0.000 | 1.000 | 0.339
게이트내외 | 0.000 | 0.194 | 0.339 | 1.000

 | 지상지하구분 | 역층 | 게이트내외 | 출구번호
지상지하구분 | 1.000 | 0.183 | 0.000 | 1.000
역층 | 0.183 | 1.000 | 0.194 | 0.000
게이트내외 | 0.000 | 0.194 | 1.000 | 0.339
출구번호 | 1.000 | 0.000 | 0.339 | 1.000
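
The report does not name the coefficient behind these matrices. For pairs of categorical columns a common choice is Cramér's V, which is 0 when two variables are independent and 1 when one fully determines the other. The sketch below is the plain, uncorrected form of that statistic, offered as an assumption about the kind of measure shown here; it will not necessarily reproduce the values above, and the input lists are made-up toy data.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    joint = Counter(zip(xs, ys))
    if len(row_totals) < 2 or len(col_totals) < 2:
        # Choice made for this sketch: undefined when either variable is constant.
        return float("nan")
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for r, n_r in row_totals.items():
        for c, n_c in col_totals.items():
            expected = n_r * n_c / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (min(len(row_totals), len(col_totals)) - 1)))


# Toy data: the second list is fully determined by the first.
dependent = cramers_v(["지하", "지하", "지상", "지상"], ["4", "4", "1", "1"])
# Toy data: every combination occurs equally often.
independent = cramers_v(["지하", "지하", "지상", "지상"], ["4", "1", "4", "1"])
print(dependent, independent)  # prints 1.0 0.0
```

With only one 지상 row in 36, a coefficient of 1.000 between 지상지하구분 and 출구번호 rests on very little evidence, which is worth keeping in mind when reading the High correlation alert.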

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

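First rows

 | 철도운영기관명 | 선명 | 역명 | 지상지하구분 | 역층 | 게이트내외 | 출구번호 | 상세위치
0 | 대구교통공사 | 2호선 | 대구역 | 지하 | 2 | 외부 | <NA> | (B2) 안심방면 표내는곳 옆
1 | 대구교통공사 | 2호선 | 중앙로 | 지하 | 2 | 외부 | <NA> | (B2) 역무실 반대측 표내는 곳 옆
2 | 대구교통공사 | 2호선 | 감삼 | 지하 | 2 | 외부 | <NA> | (B2) 표내는 곳 옆
3 | 대구교통공사 | 2호선 | 강창 | 지하 | 2 | 외부 | <NA> | (B2) 대합실 영남대방면 승강장 내려가는 계단 옆
4 | 대구교통공사 | 2호선 | 경대병원 | 지하 | 1 | 외부 | 4 | (B1) 표내는곳 앞
5 | 대구교통공사 | 2호선 | 계명대 | 지하 | 2 | 외부 | <NA> | (B2) 영남대방면 승강장 내려가는 계단 옆
6 | 대구교통공사 | 2호선 | 고산 | 지하 | 1 | 외부 | 1 | (B1) 자동매표소 옆
7 | 대구교통공사 | 2호선 | 내당 | 지하 | 1 | 외부 | 4 | (B1) 표내는곳 옆
8 | 대구교통공사 | 2호선 | 다사 | 지하 | 1 | 외부 | 4 | (B1) 대합실 4번 출입구 계단 옆
9 | 대구교통공사 | 2호선 | 담티 | 지하 | 1 | 외부 | 4 | (B1) 역무실 옆

Last rows

 | 철도운영기관명 | 선명 | 역명 | 지상지하구분 | 역층 | 게이트내외 | 출구번호 | 상세위치
26 | 대구교통공사 | 2호선 | 연호 | 지하 | 1 | 외부 | 1 | (B1) 승강장 내려가는 엘리베이터 옆
27 | 대구교통공사 | 2호선 | 영남대 | 지하 | 1 | 외부 | 2 | (B1) 2번 출입구 방향 파우더룸 앞
28 | 대구교통공사 | 2호선 | 용산 | 지하 | 1,2 | 외부 | 4 | (B1) 영남대방면 승강장 내려가는 계단 전시공간 옆 (B2) 영남대 방면 승강장 내려가는 계단 환기실 옆
29 | 대구교통공사 | 2호선 | 이곡 | 지하 | 1 | 외부 | 4 | (B1) 4번출구쪽 E/V 옆
30 | 대구교통공사 | 2호선 | 임당 | 지하 | 1 | 외부 | 4 | (B1) 3번/4번 출입구 사이
31 | 대구교통공사 | 2호선 | 정평 | 지하 | 1 | 외부 | 3 | (B1) 역무실 앞
32 | 대구교통공사 | 2호선 | 정평 | 지하 | 1 | 내부 | 3 | (B1) 영남대 방면 승강장 내려가는 엘리베이터 옆
33 | 대구교통공사 | 2호선 | 죽전 | 지하 | 3 | 외부 | <NA> | (B3) 영남대 방면 승강장 내려가는 계단옆
34 | 대구교통공사 | 2호선 | 청라언덕 | 지하 | 1 | 외부 | 9 | (B1) 역무실 옆
35 | 대구교통공사 | 2호선 | 청라언덕 | 지하 | 1 | 내부 | 9 | (B1) 지하철 경찰대 출장소 옆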