Overview

Dataset statistics

Number of variables: 9
Number of observations: 99
Missing cells: 13
Missing cells (%): 1.5%
Duplicate rows: 11
Duplicate rows (%): 11.1%
Total size in memory: 7.3 KiB
Average record size in memory: 75.3 B
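
The two percentages follow from the counts. As a rough check, the sketch below assumes that missing cells are divided by the total number of cells (variables × observations) and duplicate rows by the number of observations; that convention is an assumption, though it reproduces the reported figures.

```python
# Counts copied from the dataset statistics above.
n_variables = 9
n_observations = 99
missing_cells = 13
duplicate_rows = 11

# Assumption: percentages are taken over all cells and over all rows.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
```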

Variable types

Categorical: 7
Text: 1
Numeric: 1

Dataset

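Description: Transfer information for stations on Seoul Metropolitan Subway Line 1 (수도권 1호선). The fields are 철도운영기관명 (railway operator name), 선명 (line name), 역명 (station name), 환승철도운영기관 (transfer railway operator), 환승선명 (transfer line name), 환승이후역명 (name of the station after the transfer), 환승기점역명 (transfer terminus station name), 차량순서 (car order), and 차량출입문번호 (car door number).
Author: 국가철도공단 (Korea National Railway)
URL: https://www.data.go.kr/data/15041051/fileData.do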

Alerts

선명 has constant value "1호선" (Constant)
Dataset has 11 (11.1%) duplicate rows (Duplicates)
환승철도운영기관 is highly overall correlated with 역명 and 2 other fields (High correlation)
환승기점역명 is highly overall correlated with 역명 and 2 other fields (High correlation)
환승선명 is highly overall correlated with 역명 and 2 other fields (High correlation)
차량순서 is highly overall correlated with 역명 (High correlation)
철도운영기관명 is highly overall correlated with 역명 (High correlation)
역명 is highly overall correlated with 차량순서 and 5 other fields (High correlation)
차량출입문번호 is highly overall correlated with 역명 (High correlation)
환승이후역명 has 13 (13.1%) missing values (Missing)
차량순서 has 3 (3.0%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-12 20:19:57.963664
Analysis finished: 2023-12-12 20:19:58.855828
Duration: 0.89 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
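
A report of this kind can be regenerated with ydata-profiling. The sketch below is a minimal example under stated assumptions: the helper `build_report`, the file names `line1_transfers.csv` and `report.html`, the report title, and the `cp949` encoding are hypothetical and not taken from the source. Only `ProfileReport` and its `to_file` method belong to the library.

```python
def build_report(csv_path="line1_transfers.csv", out_path="report.html"):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so the sketch can be loaded without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the downloaded CSV is cp949-encoded; adjust if it is not.
    df = pd.read_csv(csv_path, encoding="cp949")

    profile = ProfileReport(df, title="Line 1 transfer information")
    profile.to_file(out_path)
    return out_path
```

Calling `build_report()` with a real CSV path would write the HTML file; timings and memory figures will differ from those listed above.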

Variables

철도운영기관명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B
코레일: 63
서울교통공사: 36

Length

Max length: 6
Median length: 3
Mean length: 4.0909091
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 코레일
2nd row: 코레일
3rd row: 코레일
4th row: 코레일
5th row: 코레일

Common Values

Value, Count, Frequency (%)
코레일, 63, 63.6%
서울교통공사, 36, 36.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]
Value, Count, Frequency (%)
코레일, 63, 63.6%
서울교통공사, 36, 36.4%

선명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B
1호선: 99

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1호선
2nd row: 1호선
3rd row: 1호선
4th row: 1호선
5th row: 1호선

Common Values

Value, Count, Frequency (%)
1호선, 99, 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]
Value, Count, Frequency (%)
1호선, 99, 100.0%

역명
Categorical

HIGH CORRELATION 

Distinct: 24
Distinct (%): 24.2%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B
서울역: 8
종로3가: 8
신도림: 6
주안: 4
창동: 4
Other values (19): 69

Length

Max length: 12
Median length: 7
Mean length: 3.1313131
Min length: 2

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 회룡
2nd row: 회룡
3rd row: 회룡
4th row: 회룡
5th row: 도봉산

Common Values

Value, Count, Frequency (%)
서울역, 8, 8.1%
종로3가, 8, 8.1%
신도림, 6, 6.1%
주안, 4, 4.0%
창동, 4, 4.0%
석계, 4, 4.0%
회기, 4, 4.0%
용산, 4, 4.0%
노량진, 4, 4.0%
신길, 4, 4.0%
Other values (14), 49, 49.5%

Length

[Figure: histogram of lengths of the category]

Value, Count, Frequency (%)
서울역, 8, 8.1%
종로3가, 8, 8.1%
신도림, 6, 6.1%
도봉산, 4, 4.0%
시청, 4, 4.0%
동대문, 4, 4.0%
동묘앞, 4, 4.0%
신설동, 4, 4.0%
청량리(서울시립대입구, 4, 4.0%
금정, 4, 4.0%
Other values (14), 49, 49.5%

환승철도운영기관
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 7.1%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B
서울교통공사: 60
코레일: 19
인천교통공사: 8
의정부경전철: 4
서울9호선: 4
Other values (2): 4

Length

Max length: 7
Median length: 6
Mean length: 5.4040404
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 의정부경전철
2nd row: 의정부경전철
3rd row: 의정부경전철
4th row: 의정부경전철
5th row: 서울교통공사

Common Values

Value, Count, Frequency (%)
서울교통공사, 60, 60.6%
코레일, 19, 19.2%
인천교통공사, 8, 8.1%
의정부경전철, 4, 4.0%
서울9호선, 4, 4.0%
우이신설경전철, 2, 2.0%
인천공항철도, 2, 2.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]
Value, Count, Frequency (%)
서울교통공사, 60, 60.6%
코레일, 19, 19.2%
인천교통공사, 8, 8.1%
의정부경전철, 4, 4.0%
서울9호선, 4, 4.0%
우이신설경전철, 2, 2.0%
인천공항철도, 2, 2.0%

환승선명
Categorical

HIGH CORRELATION 

Distinct: 17
Distinct (%): 17.2%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B
4호선: 16
경의중앙: 14
7호선: 12
2호선: 8
6호선: 8
Other values (12): 41

Length

Max length: 6
Median length: 3
Mean length: 3.4949495
Min length: 2

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 의정부경전철
2nd row: 의정부경전철
3rd row: 의정부경전철
4th row: 의정부경전철
5th row: 7호선

Common Values

Value, Count, Frequency (%)
4호선, 16, 16.2%
경의중앙, 14, 14.1%
7호선, 12, 12.1%
2호선, 8, 8.1%
6호선, 8, 8.1%
5호선, 8, 8.1%
의정부경전철, 4, 4.0%
9호선, 4, 4.0%
<NA>, 4, 4.0%
인천2호선, 4, 4.0%
Other values (7), 17, 17.2%

Length

[Figure: histogram of lengths of the category]

Value, Count, Frequency (%)
4호선, 16, 16.2%
경의중앙, 14, 14.1%
7호선, 12, 12.1%
2호선, 8, 8.1%
6호선, 8, 8.1%
5호선, 8, 8.1%
인천2호선, 4, 4.0%
인천1호선, 4, 4.0%
3호선, 4, 4.0%
na, 4, 4.0%
Other values (7), 17, 17.2%

환승이후역명
Text

MISSING 

Distinct: 43
Distinct (%): 50.0%
Missing: 13
Missing (%): 13.1%
Memory size: 924.0 B

Length

Max length: 11
Median length: 10
Mean length: 3.744186
Min length: 2

Characters and Unicode

Total characters: 322
Distinct characters: 89
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 발곡
2nd row: 범골
3rd row: 발곡
4th row: 범골
5th row: 수락산

Value, Count, Frequency (%)
발곡, 2, 2.3%
동대문역사문화공원, 2, 2.3%
시민공원, 2, 2.3%
남구로, 2, 2.3%
매교, 2, 2.3%
왕십리, 2, 2.3%
회기, 2, 2.3%
보문, 2, 2.3%
신당, 2, 2.3%
창신, 2, 2.3%
Other values (33), 66, 76.7%

Most occurring characters

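Value, Count, Frequency (%)
[character missing], 16, 5.0%
[character missing], 14, 4.3%
[character missing], 12, 3.7%
(, 10, 3.1%
), 10, 3.1%
[character missing], 10, 3.1%
[character missing], 8, 2.5%
[character missing], 8, 2.5%
[character missing], 8, 2.5%
[character missing], 8, 2.5%
Other values (79), 218, 67.7%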

Most occurring categories

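Value, Count, Frequency (%)
Other Letter, 298, 92.5%
Open Punctuation, 10, 3.1%
Close Punctuation, 10, 3.1%
Decimal Number, 4, 1.2%

Most frequent character per category

Other Letter
Value, Count, Frequency (%)
[character missing], 16, 5.4%
[character missing], 14, 4.7%
[character missing], 12, 4.0%
[character missing], 10, 3.4%
[character missing], 8, 2.7%
[character missing], 8, 2.7%
[character missing], 8, 2.7%
[character missing], 8, 2.7%
[character missing], 8, 2.7%
[character missing], 8, 2.7%
Other values (75), 198, 66.4%

Decimal Number
Value, Count, Frequency (%)
3, 2, 50.0%
4, 2, 50.0%

Open Punctuation
Value, Count, Frequency (%)
(, 10, 100.0%

Close Punctuation
Value, Count, Frequency (%)
), 10, 100.0%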

Most occurring scripts

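Value, Count, Frequency (%)
Hangul, 298, 92.5%
Common, 24, 7.5%

Most frequent character per script

Hangul
Value, Count, Frequency (%)
[character missing], 16, 5.4%
[character missing], 14, 4.7%
[character missing], 12, 4.0%
[character missing], 10, 3.4%
[character missing], 8, 2.7%
[character missing], 8, 2.7%
[character missing], 8, 2.7%
[character missing], 8, 2.7%
[character missing], 8, 2.7%
[character missing], 8, 2.7%
Other values (75), 198, 66.4%

Common
Value, Count, Frequency (%)
(, 10, 41.7%
), 10, 41.7%
3, 2, 8.3%
4, 2, 8.3%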

Most occurring blocks

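Value, Count, Frequency (%)
Hangul, 298, 92.5%
ASCII, 24, 7.5%

Most frequent character per block

Hangul
Value, Count, Frequency (%)
[character missing], 16, 5.4%
[character missing], 14, 4.7%
[character missing], 12, 4.0%
[character missing], 10, 3.4%
[character missing], 8, 2.7%
[character missing], 8, 2.7%
[character missing], 8, 2.7%
[character missing], 8, 2.7%
[character missing], 8, 2.7%
[character missing], 8, 2.7%
Other values (75), 198, 66.4%

ASCII
Value, Count, Frequency (%)
(, 10, 41.7%
), 10, 41.7%
3, 2, 8.3%
4, 2, 8.3%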

환승기점역명
Categorical

HIGH CORRELATION 

Distinct: 24
Distinct (%): 24.2%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B
<NA>: 13
문산: 8
당고개: 8
남태령: 8
용산: 6
Other values (19): 56

Length

Max length: 10
Median length: 8
Mean length: 3.7171717
Min length: 2

Unique

Unique: 2
Unique (%): 2.0%

Sample

1st row: 발곡
2nd row: 탑석
3rd row: 발곡
4th row: 탑석
5th row: 부평구청

Common Values

Value, Count, Frequency (%)
<NA>, 13, 13.1%
문산, 8, 8.1%
당고개, 8, 8.1%
남태령, 8, 8.1%
용산, 6, 6.1%
충정로(경기대입구), 6, 6.1%
장암, 6, 6.1%
부평구청, 6, 6.1%
방화, 4, 4.0%
응암, 4, 4.0%
Other values (14), 30, 30.3%

Length

[Figure: histogram of lengths of the category]

Value, Count, Frequency (%)
na, 13, 13.1%
당고개, 8, 8.1%
남태령, 8, 8.1%
문산, 8, 8.1%
용산, 6, 6.1%
충정로(경기대입구, 6, 6.1%
장암, 6, 6.1%
부평구청, 6, 6.1%
봉화산(서울의료원, 4, 4.0%
마천, 4, 4.0%
Other values (14), 30, 30.3%

차량순서
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 9
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.2626263
Minimum: 0
Maximum: 10
Zeros: 3
Zeros (%): 3.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
Median: 6
Q3: 9
95-th percentile: 10
Maximum: 10
Range: 10
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 3.671695
Coefficient of variation (CV): 0.69769252
Kurtosis: -1.614801
Mean: 5.2626263
Median Absolute Deviation (MAD): 4
Skewness: 0.028463124
Sum: 521
Variance: 13.481344
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=9)]

Common values

Value, Count, Frequency (%)
1, 25, 25.3%
10, 21, 21.2%
9, 12, 12.1%
2, 11, 11.1%
6, 11, 11.1%
5, 8, 8.1%
7, 6, 6.1%
0, 3, 3.0%
4, 2, 2.0%

Values in ascending order

Value, Count, Frequency (%)
0, 3, 3.0%
1, 25, 25.3%
2, 11, 11.1%
4, 2, 2.0%
5, 8, 8.1%
6, 11, 11.1%
7, 6, 6.1%
9, 12, 12.1%
10, 21, 21.2%

Values in descending order

Value, Count, Frequency (%)
10, 21, 21.2%
9, 12, 12.1%
7, 6, 6.1%
6, 11, 11.1%
5, 8, 8.1%
4, 2, 2.0%
2, 11, 11.1%
1, 25, 25.3%
0, 3, 3.0%
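
The descriptive statistics for 차량순서 can be re-derived from its frequency table. The sketch below expands the table into the 99 observations and uses the standard library `statistics` module. It assumes the report's variance and standard deviation use the sample (n − 1) denominator, which is what `statistics.variance` and `statistics.stdev` compute.

```python
import statistics

# Frequency table for 차량순서 (value -> count), copied from the report.
counts = {0: 3, 1: 25, 2: 11, 4: 2, 5: 8, 6: 11, 7: 6, 9: 12, 10: 21}

# Expand the table back into the individual observations.
values = [v for v, c in counts.items() for _ in range(c)]

total = sum(values)
mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)          # sample standard deviation
cv = std / mean                         # coefficient of variation

print(len(values), total, median)
print(round(mean, 7), round(variance, 6), round(std, 6), round(cv, 6))
```

Under that assumption the results agree with the reported sum, mean, median, variance, standard deviation, and CV.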

차량출입문번호
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B
1: 32
4: 28
2: 21
3: 15
0: 3

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 3
4th row: 2
5th row: 3

Common Values

Value, Count, Frequency (%)
1, 32, 32.3%
4, 28, 28.3%
2, 21, 21.2%
3, 15, 15.2%
0, 3, 3.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]
Value, Count, Frequency (%)
1, 32, 32.3%
4, 28, 28.3%
2, 21, 21.2%
3, 15, 15.2%
0, 3, 3.0%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]
, 철도운영기관명, 역명, 환승철도운영기관, 환승선명, 환승이후역명, 환승기점역명, 차량순서, 차량출입문번호
철도운영기관명, 1.000, 1.000, 0.330, 0.651, 1.000, 0.475, 0.276, 0.180
역명, 1.000, 1.000, 0.962, 0.996, 1.000, 0.941, 0.910, 0.860
환승철도운영기관, 0.330, 0.962, 1.000, 1.000, 1.000, 1.000, 0.431, 0.373
환승선명, 0.651, 0.996, 1.000, 1.000, 1.000, 1.000, 0.788, 0.684
환승이후역명, 1.000, 1.000, 1.000, 1.000, 1.000, 0.998, 0.734, 0.503
환승기점역명, 0.475, 0.941, 1.000, 1.000, 0.998, 1.000, 0.661, 0.255
차량순서, 0.276, 0.910, 0.431, 0.788, 0.734, 0.661, 1.000, 0.776
차량출입문번호, 0.180, 0.860, 0.373, 0.684, 0.503, 0.255, 0.776, 1.000

[Figure: correlation heatmap]
, 차량출입문번호, 철도운영기관명, 환승철도운영기관, 역명, 환승기점역명, 환승선명
차량출입문번호, 1.000, 0.216, 0.245, 0.562, 0.098, 0.405
철도운영기관명, 0.216, 1.000, 0.343, 0.879, 0.356, 0.477
환승철도운영기관, 0.245, 0.343, 1.000, 0.755, 0.887, 0.947
역명, 0.562, 0.879, 0.755, 1.000, 0.598, 0.873
환승기점역명, 0.098, 0.356, 0.887, 0.598, 1.000, 0.929
환승선명, 0.405, 0.477, 0.947, 0.873, 0.929, 1.000

[Figure: correlation heatmap]
, 차량순서, 철도운영기관명, 역명, 환승철도운영기관, 환승선명, 환승기점역명, 차량출입문번호
차량순서, 1.000, 0.248, 0.578, 0.258, 0.451, 0.323, 0.326
철도운영기관명, 0.248, 1.000, 0.879, 0.343, 0.477, 0.356, 0.216
역명, 0.578, 0.879, 1.000, 0.755, 0.873, 0.598, 0.562
환승철도운영기관, 0.258, 0.343, 0.755, 1.000, 0.947, 0.887, 0.245
환승선명, 0.451, 0.477, 0.873, 0.947, 1.000, 0.929, 0.405
환승기점역명, 0.323, 0.356, 0.598, 0.887, 0.929, 1.000, 0.098
차량출입문번호, 0.326, 0.216, 0.562, 0.245, 0.405, 0.098, 1.000
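
The matrices above do not name their measure. Association between two categorical columns is commonly measured with Cramér's V, and the sketch below shows an uncorrected version of it in plain Python. The function name `cramers_v` is hypothetical, and the library's own implementation may differ (for example by applying a bias correction), so this is not expected to reproduce the figures above.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    row = Counter(xs)
    col = Counter(ys)
    joint = Counter(zip(xs, ys))
    # Pearson chi-squared statistic over every cell of the contingency table.
    chi2 = 0.0
    for r, r_count in row.items():
        for c, c_count in col.items():
            expected = r_count * c_count / n
            chi2 += (joint[(r, c)] - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# 역명 and 환승선명 for the first ten sample rows of the report.
stations = ["회룡"] * 4 + ["도봉산"] * 4 + ["창동"] * 2
lines = ["의정부경전철"] * 4 + ["7호선"] * 4 + ["4호선"] * 2
print(round(cramers_v(stations, lines), 3))
```

In these ten rows each station maps to exactly one transfer line, so the association is perfect. This is the same kind of dependence that makes 역명 appear in most of the high-correlation alerts.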

Missing values

[Figure: nullity by column] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

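First rows

, 철도운영기관명, 선명, 역명, 환승철도운영기관, 환승선명, 환승이후역명, 환승기점역명, 차량순서, 차량출입문번호
0, 코레일, 1호선, 회룡, 의정부경전철, 의정부경전철, 발곡, 발곡, 2, 2
1, 코레일, 1호선, 회룡, 의정부경전철, 의정부경전철, 범골, 탑석, 2, 2
2, 코레일, 1호선, 회룡, 의정부경전철, 의정부경전철, 발곡, 발곡, 9, 3
3, 코레일, 1호선, 회룡, 의정부경전철, 의정부경전철, 범골, 탑석, 2, 2
4, 코레일, 1호선, 도봉산, 서울교통공사, 7호선, 수락산, 부평구청, 9, 3
5, 코레일, 1호선, 도봉산, 서울교통공사, 7호선, 장암, 장암, 9, 3
6, 코레일, 1호선, 도봉산, 서울교통공사, 7호선, 장암, 장암, 2, 1
7, 코레일, 1호선, 도봉산, 서울교통공사, 7호선, 수락산, 부평구청, 2, 1
8, 코레일, 1호선, 창동, 서울교통공사, 4호선, 노원, 당고개, 10, 3
9, 코레일, 1호선, 창동, 서울교통공사, 4호선, 쌍문, 남태령, 10, 3

Last rows

, 철도운영기관명, 선명, 역명, 환승철도운영기관, 환승선명, 환승이후역명, 환승기점역명, 차량순서, 차량출입문번호
89, 서울교통공사, 1호선, 시청, 서울교통공사, 2호선, 충정로(경기대입구), 충정로(경기대입구), 1, 1
90, 서울교통공사, 1호선, 시청, 서울교통공사, 2호선, 을지로입구, 충정로(경기대입구), 10, 4
91, 서울교통공사, 1호선, 서울역, 코레일, 경의중앙, 신촌, 문산, 10, 4
92, 서울교통공사, 1호선, 서울역, 서울교통공사, 4호선, 숙대입구(갈월), 남태령, 10, 4
93, 서울교통공사, 1호선, 서울역, 서울교통공사, 4호선, 회현(남대문시장), 당고개, 10, 4
94, 서울교통공사, 1호선, 서울역, 코레일, 경의중앙, 신촌, 문산, 1, 2
95, 서울교통공사, 1호선, 서울역, 인천공항철도, 공항철도, 공덕, 인천공항1터미널, 1, 2
96, 서울교통공사, 1호선, 서울역, 서울교통공사, 4호선, 숙대입구(갈월), 남태령, 1, 2
97, 서울교통공사, 1호선, 서울역, 서울교통공사, 4호선, 회현(남대문시장), 당고개, 1, 2
98, 서울교통공사, 1호선, 서울역, 인천공항철도, 공항철도, 공덕, 인천공항1터미널, 10, 4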

Duplicate rows

Most frequently occurring

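, 철도운영기관명, 선명, 역명, 환승철도운영기관, 환승선명, 환승이후역명, 환승기점역명, 차량순서, 차량출입문번호, # duplicates
0, 서울교통공사, 1호선, 종로3가, 서울교통공사, 3호선, 안국, 구파발, 6, 1, 2
1, 서울교통공사, 1호선, 종로3가, 서울교통공사, 3호선, 을지로3가, 오금, 6, 1, 2
2, 서울교통공사, 1호선, 종로3가, 서울교통공사, 5호선, 광화문(세종문화회관), 방화, 6, 1, 2
3, 서울교통공사, 1호선, 종로3가, 서울교통공사, 5호선, 을지로4가, 마천, 6, 1, 2
4, 코레일, 1호선, 금정, 서울교통공사, 4호선, <NA>, 남태령, 0, 0, 2
5, 코레일, 1호선, 금정, 서울교통공사, 4호선, <NA>, 당고개, 9, 2, 2
6, 코레일, 1호선, 노량진, 서울9호선, 9호선, <NA>, <NA>, 1, 1, 2
7, 코레일, 1호선, 노량진, 서울9호선, 9호선, <NA>, <NA>, 10, 4, 2
8, 코레일, 1호선, 부평, 인천교통공사, 인천1호선, 동수, 국제업무지구, 5, 4, 2
9, 코레일, 1호선, 부평, 인천교통공사, 인천1호선, 부평시장, 계양, 5, 4, 2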
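
Duplicate groups like these can be found by counting identical rows. The sketch below uses a small illustrative subset (three rows, not the full dataset) in the report's column order. How the report itself tallies its 11 duplicate rows is not stated, so counting rows beyond the first occurrence is an assumption.

```python
from collections import Counter

# Illustrative subset only: the first two rows are identical.
rows = [
    ("코레일", "1호선", "부평", "인천교통공사", "인천1호선", "동수", "국제업무지구", 5, 4),
    ("코레일", "1호선", "부평", "인천교통공사", "인천1호선", "동수", "국제업무지구", 5, 4),
    ("서울교통공사", "1호선", "서울역", "인천공항철도", "공항철도", "공덕", "인천공항1터미널", 1, 2),
]

occurrences = Counter(rows)
# Keep only rows that occur more than once, with their occurrence count.
duplicates = {row: n for row, n in occurrences.items() if n > 1}
# Assumption: a "duplicate row" is any occurrence beyond the first in its group.
extra_rows = sum(n - 1 for n in duplicates.values())

print(len(duplicates), extra_rows)
```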