Overview

Dataset statistics

Number of variables: 10
Number of observations: 149
Missing cells: 140
Missing cells (%): 9.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 12.2 KiB
Average record size in memory: 83.9 B
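
The missing-cell percentage follows from the counts above: 10 variables times 149 observations give 1,490 cells, of which 140 are missing. A minimal sketch of that arithmetic, with illustrative variable names that are not part of the report:

```python
# Counts copied from the dataset statistics above.
n_variables = 10
n_observations = 149
missing_cells = 140

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

# Share of cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

All 140 missing cells belong to the single column 출입구번호, which is why the per-column figure (94.0%) is so much higher than the dataset-wide one.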

Variable types

Categorical: 7
Text: 3

Dataset

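Description: Data for the urban and metropolitan railway stations operated by 대구도시철도공사 (Daegu Metropolitan Transit Corporation), with the columns 철도운영기관명 (railway operator name), 선명 (line name), 역명 (station name), 지상지하구분 (above-ground/underground flag), 역층 (station floor), 출입구번호 (exit number), 상세위치 (detailed location), 제세동기출력에너지 (defibrillator output energy), 제세동기운영방식 (defibrillator operating mode), and 수량 (quantity).
Author: 국가철도공단 (Korea National Railway)
URL: https://www.data.go.kr/data/15041484/fileData.do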

Alerts

철도운영기관명 has constant value "대구교통공사" (Constant)
제세동기운영방식 has constant value "자동" (Constant)
수량 has constant value "1" (Constant)
지상지하구분 is highly overall correlated with 선명 (High correlation)
선명 is highly overall correlated with 지상지하구분 (High correlation)
제세동기출력에너지 is highly imbalanced (60.7%) (Imbalance)
출입구번호 has 140 (94.0%) missing values (Missing)
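
The 60.7% imbalance figure can be approximated from the class counts the report gives for 제세동기출력에너지 (132, 10 and 7). The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by its maximum, log2 of the number of classes. That definition is an assumption here, not something the report states; the function name is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts.

    0 means perfectly balanced classes, 1 means a single dominant class.
    Assumed definition; the report does not spell out its formula.
    """
    k = len(counts)
    if k <= 1:
        return 0.0
    total = sum(counts)
    # Shannon entropy in bits, skipping empty classes.
    entropy_bits = -sum(
        (c / total) * math.log2(c / total) for c in counts if c > 0
    )
    return 1 - entropy_bits / math.log2(k)

# Counts for the values 150, 200 and 180 as listed in the report.
score = imbalance_score([132, 10, 7])
print(f"{score:.1%}")
```

Under this assumed definition the score lands at roughly 0.61, in line with the alert.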

Reproduction

Analysis started: 2023-12-12 06:19:11.432679
Analysis finished: 2023-12-12 06:19:12.166331
Duration: 0.73 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
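
The reported duration is the difference between the two timestamps. A small sketch using only the standard library, with the timestamps copied from the lines above:

```python
from datetime import datetime

# Timestamps as printed in the Reproduction section.
started = datetime.fromisoformat("2023-12-12 06:19:11.432679")
finished = datetime.fromisoformat("2023-12-12 06:19:12.166331")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```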

Variables

철도운영기관명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

[Figure: Histogram of lengths of the category]

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대구교통공사
2nd row: 대구교통공사
3rd row: 대구교통공사
4th row: 대구교통공사
5th row: 대구교통공사

Common Values

Value | Count | Frequency (%)
대구교통공사 | 149 | 100.0%

선명
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

[Figure: Histogram of lengths of the category]

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1호선
2nd row: 1호선
3rd row: 1호선
4th row: 1호선
5th row: 1호선

Common Values

Value | Count | Frequency (%)
1호선 | 63 | 42.3%
2호선 | 56 | 37.6%
3호선 | 30 | 20.1%

역명
Text

Distinct: 93
Distinct (%): 62.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 8
Median length: 3
Mean length: 3.3892617
Min length: 2

Characters and Unicode

Total characters: 505
Distinct characters: 110
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 39
Unique (%): 26.2%

Sample

1st row: 각산
2nd row: 각산
3rd row: 교대
4th row: 교대
5th row: 대곡역

Value | Count | Frequency (%)
청라언덕역 | 3 | 2.0%
반월당 | 3 | 2.0%
대구은행역 | 2 | 1.3%
용산역 | 2 | 1.3%
각산 | 2 | 1.3%
내당역 | 2 | 1.3%
교대 | 2 | 1.3%
대공원역 | 2 | 1.3%
담티역 | 2 | 1.3%
다사역 | 2 | 1.3%
Other values (83) | 127 | 85.2%

Most occurring characters

Value | Count | Frequency (%)
(unreadable) | 109 | 21.6%
(unreadable) | 27 | 5.3%
(unreadable) | 13 | 2.6%
(unreadable) | 11 | 2.2%
(unreadable) | 11 | 2.2%
(unreadable) | 10 | 2.0%
(unreadable) | 10 | 2.0%
(unreadable) | 9 | 1.8%
(unreadable) | 9 | 1.8%
(unreadable) | 8 | 1.6%
Other values (100) | 288 | 57.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 505 | 100.0%

Most frequent character per category

Other Letter: same rows as the table under Most occurring characters.

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 505 | 100.0%

Most frequent character per script

Hangul: same rows as the table under Most occurring characters.

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 505 | 100.0%

Most frequent character per block

Hangul: same rows as the table under Most occurring characters.

지상지하구분
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

[Figure: Histogram of lengths of the category]

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지하
2nd row: 지하
3rd row: 지하
4th row: 지하
5th row: 지하

Common Values

Value | Count | Frequency (%)
지하 | 118 | 79.2%
지상 | 31 | 20.8%

역층
Categorical

Distinct: 5
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

[Figure: Histogram of lengths of the category]

Unique

Unique: 1
Unique (%): 0.7%

Sample

1st row: 1
2nd row: 2
3rd row: 2
4th row: 3
5th row: 2

Common Values

Value | Count | Frequency (%)
2 | 66 | 44.3%
1 | 37 | 24.8%
3 | 35 | 23.5%
4 | 10 | 6.7%
5 | 1 | 0.7%

출입구번호
Text

MISSING 

Distinct: 5
Distinct (%): 55.6%
Missing: 140
Missing (%): 94.0%
Memory size: 1.3 KiB

Length

Max length: 4
Median length: 2
Mean length: 2.4444444
Min length: 2

Characters and Unicode

Total characters: 22
Distinct characters: 6
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 33.3%

Sample

1st row: 3번
2nd row: 1번
3rd row: 1번
4th row: 2번
5th row: 1번

Value | Count | Frequency (%)
1번 | 4 | 44.4%
2번 | 2 | 22.2%
3번 | 1 | 11.1%
2번3번 | 1 | 11.1%
1~4번 | 1 | 11.1%

Most occurring characters

Value | Count | Frequency (%)
번 | 10 | 45.5%
1 | 5 | 22.7%
2 | 3 | 13.6%
3 | 2 | 9.1%
~ | 1 | 4.5%
4 | 1 | 4.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 11 | 50.0%
Other Letter | 10 | 45.5%
Math Symbol | 1 | 4.5%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 5 | 45.5%
2 | 3 | 27.3%
3 | 2 | 18.2%
4 | 1 | 9.1%

Other Letter
Value | Count | Frequency (%)
번 | 10 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 12 | 54.5%
Hangul | 10 | 45.5%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 5 | 41.7%
2 | 3 | 25.0%
3 | 2 | 16.7%
~ | 1 | 8.3%
4 | 1 | 8.3%

Hangul
Value | Count | Frequency (%)
번 | 10 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 12 | 54.5%
Hangul | 10 | 45.5%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
번 | 10 | 100.0%

ASCII
Value | Count | Frequency (%)
1 | 5 | 41.7%
2 | 3 | 25.0%
3 | 2 | 16.7%
~ | 1 | 8.3%
4 | 1 | 8.3%
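
The percentages for 출입구번호 use the 9 non-missing rows as the denominator, not all 149 observations. The sketch below recomputes the distinct, unique and length figures from the value counts listed above, and then shows one possible way to expand the labels into exit numbers. The helper `parse_exits` is hypothetical, and reading `~` as an inclusive range is an assumption about the data.

```python
import re
from collections import Counter

# Value counts copied from the frequency table for 출입구번호 (non-missing rows only).
value_counts = Counter({"1번": 4, "2번": 2, "3번": 1, "2번3번": 1, "1~4번": 1})

n_present = sum(value_counts.values())                     # non-missing rows
distinct = len(value_counts)                               # different labels
unique = sum(1 for c in value_counts.values() if c == 1)   # labels seen exactly once
total_chars = sum(len(v) * c for v, c in value_counts.items())

print(f"distinct {100 * distinct / n_present:.1f}%, "
      f"unique {100 * unique / n_present:.1f}%, "
      f"mean length {total_chars / n_present:.4f}")

def parse_exits(label):
    """Expand a label such as '2번3번' or '1~4번' into a list of exit numbers.

    Assumption: 'a~b' means every exit from a to b inclusive.
    """
    exits = []
    for start, end in re.findall(r"(\d+)(?:~(\d+))?", label):
        low = int(start)
        high = int(end) if end else low
        exits.extend(range(low, high + 1))
    return exits

for label in value_counts:
    print(label, parse_exits(label))
```

With 140 of 149 values missing, any analysis built on the parsed exit numbers would rest on only nine rows.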
상세위치
Text

Distinct: 103
Distinct (%): 69.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 23
Median length: 21
Mean length: 15.255034
Min length: 8

Characters and Unicode

Total characters: 2273
Distinct characters: 124
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 87
Unique (%): 58.4%

Sample

1st row: (B1)고객안내센터 앞
2nd row: (B2)반야월역방향 승강장 2-1 앞
3rd row: (B2)고객안내센터쪽 발매기 옆
4th row: (B3)명덕역방향 3-3 앞
5th row: (B2)고객안내센터 앞

Value | Count | Frequency (%)
(unreadable) | 117 | 26.7%
승강장 | 53 | 12.1%
(unreadable) | 30 | 6.8%
b1)고객안내센터 | 21 | 4.8%
b1)표내는곳 | 7 | 1.6%
에스컬레이터 | 7 | 1.6%
3-4 | 6 | 1.4%
3-3 | 6 | 1.4%
4-4 | 6 | 1.4%
f2)표내는곳 | 6 | 1.4%
Other values (119) | 180 | 41.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 290 | 12.8%
( | 147 | 6.5%
) | 147 | 6.5%
(unreadable) | 118 | 5.2%
B | 116 | 5.1%
2 | 89 | 3.9%
(unreadable) | 74 | 3.3%
(unreadable) | 74 | 3.3%
3 | 68 | 3.0%
(unreadable) | 63 | 2.8%
Other values (114) | 1087 | 47.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1211 | 53.3%
Space Separator | 290 | 12.8%
Decimal Number | 271 | 11.9%
Open Punctuation | 147 | 6.5%
Close Punctuation | 147 | 6.5%
Uppercase Letter | 147 | 6.5%
Dash Punctuation | 60 | 2.6%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(unreadable) | 118 | 9.7%
(unreadable) | 74 | 6.1%
(unreadable) | 74 | 6.1%
(unreadable) | 63 | 5.2%
(unreadable) | 61 | 5.0%
(unreadable) | 58 | 4.8%
(unreadable) | 55 | 4.5%
(unreadable) | 54 | 4.5%
(unreadable) | 42 | 3.5%
(unreadable) | 37 | 3.1%
Other values (102) | 575 | 47.5%

Decimal Number
Value | Count | Frequency (%)
2 | 89 | 32.8%
3 | 68 | 25.1%
1 | 53 | 19.6%
4 | 46 | 17.0%
5 | 12 | 4.4%
6 | 3 | 1.1%

Uppercase Letter
Value | Count | Frequency (%)
B | 116 | 78.9%
F | 31 | 21.1%

Space Separator
Value | Count | Frequency (%)
(space) | 290 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 147 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 147 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 60 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1211 | 53.3%
Common | 915 | 40.3%
Latin | 147 | 6.5%

Most frequent character per script

Hangul: same rows as the Other Letter table under Most frequent character per category.

Common
Value | Count | Frequency (%)
(space) | 290 | 31.7%
( | 147 | 16.1%
) | 147 | 16.1%
2 | 89 | 9.7%
3 | 68 | 7.4%
- | 60 | 6.6%
1 | 53 | 5.8%
4 | 46 | 5.0%
5 | 12 | 1.3%
6 | 3 | 0.3%

Latin
Value | Count | Frequency (%)
B | 116 | 78.9%
F | 31 | 21.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1211 | 53.3%
ASCII | 1062 | 46.7%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 290 | 27.3%
( | 147 | 13.8%
) | 147 | 13.8%
B | 116 | 10.9%
2 | 89 | 8.4%
3 | 68 | 6.4%
- | 60 | 5.6%
1 | 53 | 5.0%
4 | 46 | 4.3%
F | 31 | 2.9%
Other values (2) | 15 | 1.4%

Hangul: same rows as the Other Letter table under Most frequent character per category.

제세동기출력에너지
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

[Figure: Histogram of lengths of the category]

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 150
2nd row: 150
3rd row: 150
4th row: 150
5th row: 180

Common Values

Value | Count | Frequency (%)
150 | 132 | 88.6%
200 | 10 | 6.7%
180 | 7 | 4.7%

제세동기운영방식
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

[Figure: Histogram of lengths of the category]

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 자동
2nd row: 자동
3rd row: 자동
4th row: 자동
5th row: 자동

Common Values

Value | Count | Frequency (%)
자동 | 149 | 100.0%

수량
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

[Figure: Histogram of lengths of the category]

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 149 | 100.0%

Correlations

Variable | 선명 | 역명 | 지상지하구분 | 역층 | 출입구번호 | 제세동기출력에너지
선명 | 1.000 | 0.999 | 0.765 | 0.432 | 0.581 | 0.236
역명 | 0.999 | 1.000 | 0.993 | 0.000 | 1.000 | 0.000
지상지하구분 | 0.765 | 0.993 | 1.000 | 0.347 | 0.000 | 0.082
역층 | 0.432 | 0.000 | 0.347 | 1.000 | 0.000 | 0.221
출입구번호 | 0.581 | 1.000 | 0.000 | 0.000 | 1.000 | 0.711
제세동기출력에너지 | 0.236 | 0.000 | 0.082 | 0.221 | 0.711 | 1.000

Variable | 지상지하구분 | 제세동기출력에너지 | 선명 | 역층
지상지하구분 | 1.000 | 0.135 | 0.976 | 0.419
제세동기출력에너지 | 0.135 | 1.000 | 0.075 | 0.169
선명 | 0.976 | 0.075 | 1.000 | 0.360
역층 | 0.419 | 0.169 | 0.360 | 1.000

Variable | 선명 | 지상지하구분 | 역층 | 제세동기출력에너지
선명 | 1.000 | 0.976 | 0.360 | 0.075
지상지하구분 | 0.976 | 1.000 | 0.419 | 0.135
역층 | 0.360 | 0.419 | 1.000 | 0.169
제세동기출력에너지 | 0.075 | 0.135 | 0.169 | 1.000
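
The report does not say which association measure produced each matrix. For pairs of categorical columns such as 선명 and 지상지하구분, one common choice is Cramér's V, sketched below in its plain, uncorrected form. The contingency tables in the example are hypothetical toy inputs, not a cross-tabulation of this dataset, so the output is not expected to reproduce the 0.976 shown above.

```python
import math

def cramers_v(table):
    """Plain (uncorrected) Cramér's V for a contingency table given as a list of rows.

    Assumes every row and column total is positive.
    """
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    n = sum(row_sums)

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Normalise by the smaller table dimension minus one.
    k = min(len(row_sums), len(col_sums)) - 1
    return math.sqrt(chi2 / (n * k))

# Hypothetical tables: rows are categories of one column, columns of the other.
fully_associated = [[63, 0], [56, 0], [0, 30]]
independent = [[10, 20], [20, 40]]

print(cramers_v(fully_associated))
print(cramers_v(independent))
```

A value near 1 means one column almost determines the other, which is the situation the High correlation alert flags for 선명 and 지상지하구분.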

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

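First rows

Index | 철도운영기관명 | 선명 | 역명 | 지상지하구분 | 역층 | 출입구번호 | 상세위치 | 제세동기출력에너지 | 제세동기운영방식 | 수량
0 | 대구교통공사 | 1호선 | 각산 | 지하 | 1 | <NA> | (B1)고객안내센터 앞 | 150 | 자동 | 1
1 | 대구교통공사 | 1호선 | 각산 | 지하 | 2 | <NA> | (B2)반야월역방향 승강장 2-1 앞 | 150 | 자동 | 1
2 | 대구교통공사 | 1호선 | 교대 | 지하 | 2 | <NA> | (B2)고객안내센터쪽 발매기 옆 | 150 | 자동 | 1
3 | 대구교통공사 | 1호선 | 교대 | 지하 | 3 | <NA> | (B3)명덕역방향 3-3 앞 | 150 | 자동 | 1
4 | 대구교통공사 | 1호선 | 대곡역 | 지하 | 2 | <NA> | (B2)고객안내센터 앞 | 180 | 자동 | 1
5 | 대구교통공사 | 1호선 | 대곡역 | 지하 | 3 | <NA> | (B3)진천역방향 승강장 2-3 앞 | 150 | 자동 | 1
6 | 대구교통공사 | 1호선 | 대구역 | 지하 | 2 | <NA> | (B2)설화명곡방면 표내는곳 옆 | 150 | 자동 | 1
7 | 대구교통공사 | 1호선 | 대구역 | 지하 | 3 | <NA> | (B3)중앙로역방향 승강장 2-2 앞 | 150 | 자동 | 1
8 | 대구교통공사 | 1호선 | 대명 | 지하 | 1 | <NA> | (B1)고객안내센터 앞 | 150 | 자동 | 1
9 | 대구교통공사 | 1호선 | 대명역 | 지하 | 2 | <NA> | (B2)안지랑역방향 승강장 4-3 앞 | 150 | 자동 | 1

Last rows

Index | 철도운영기관명 | 선명 | 역명 | 지상지하구분 | 역층 | 출입구번호 | 상세위치 | 제세동기출력에너지 | 제세동기운영방식 | 수량
139 | 대구교통공사 | 3호선 | 지산역 | 지상 | 2 | <NA> | (F2)남자장애인화장실 옆 | 150 | 자동 | 1
140 | 대구교통공사 | 3호선 | 청라언덕역 | 지상 | 2 | <NA> | (F2)칠곡경대병원방면 에스컬레이터 앞 | 180 | 자동 | 1
141 | 대구교통공사 | 3호선 | 칠곡경대병원역 | 지상 | 2 | 2번3번 | (F2)표내는곳 앞 | 150 | 자동 | 1
142 | 대구교통공사 | 3호선 | 칠곡운암역 | 지상 | 2 | 1번 | (F2)표내는곳 앞 | 150 | 자동 | 1
143 | 대구교통공사 | 3호선 | 태전역 | 지상 | 1 | <NA> | (F1)용지방면 계단 옆 | 150 | 자동 | 1
144 | 대구교통공사 | 3호선 | 팔거역 | 지상 | 2 | 2번 | (F2)표내는곳 앞 | 180 | 자동 | 1
145 | 대구교통공사 | 3호선 | 팔달시장역 | 지상 | 2 | <NA> | (F2)에스컬레이터 옆 | 150 | 자동 | 1
146 | 대구교통공사 | 3호선 | 팔달역 | 지상 | 2 | <NA> | (F2)용지방면 에스컬레이터 옆 | 150 | 자동 | 1
147 | 대구교통공사 | 3호선 | 학정역 | 지상 | 2 | 1~4번 | (F2)표내는곳 앞 | 150 | 자동 | 1
148 | 대구교통공사 | 3호선 | 황금역 | 지상 | 2 | <NA> | (F2)안내부스 앞 | 150 | 자동 | 1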
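
In the sample rows, 상세위치 starts with a prefix such as (B2) or (F2) that appears to repeat the information in 지상지하구분 and 역층. The sketch below splits that prefix from the free-text location. Reading B as underground and F as above ground is an assumption based on the sample rows, and `split_location` is a hypothetical helper. The character counts in the report (147 opening parentheses for 149 rows) suggest a couple of rows may lack the prefix, so the function returns None in that case.

```python
import re

# "(B2)..." or "(F1)...": a side letter, a floor number, then the free-text location.
PREFIX = re.compile(r"^\((?P<side>[BF])(?P<floor>\d+)\)\s*(?P<place>.*)$")

def split_location(text):
    """Return (side, floor, place) or None when the prefix is absent."""
    match = PREFIX.match(text)
    if match is None:
        return None
    return match.group("side"), int(match.group("floor")), match.group("place")

# Values copied from the sample rows above.
samples = [
    "(B1)고객안내센터 앞",
    "(B2)반야월역방향 승강장 2-1 앞",
    "(F2)표내는곳 앞",
    "(F1)용지방면 계단 옆",
]

for value in samples:
    print(split_location(value))
```

Comparing the parsed side and floor against 지상지하구분 and 역층 would be one way to check the columns for internal consistency.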