Overview

Dataset statistics

Number of variables: 10
Number of observations: 40
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.4 KiB
Average record size in memory: 87.3 B

Variable types

Categorical: 8
Numeric: 1
Text: 1

Dataset

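Description: For the urban and metropolitan railway stations on Seoul Metropolitan Area Line 7, this dataset lists the railway operator name (철도운영기관명), line name (선명), station name (역명), wheelchair lift management number (휠체어리프트의 관리번호), exit number (출입구번호), detailed location (상세위치), length (길이), width (폭), start floor (시작층), and end floor (종료층).
Author: 국가철도공단 (Korea National Railway)
URL: https://www.data.go.kr/data/15041434/fileData.do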

Alerts

철도운영기관명 has constant value "서울교통공사" (Constant)
선명 has constant value "7호선" (Constant)
길이 has constant value "125" (Constant)
폭 has constant value "80" (Constant)
종료층 is highly overall correlated with 역명 and 2 other fields (High correlation)
출입구번호 is highly overall correlated with 휠체어리프트의 관리번호 and 3 other fields (High correlation)
시작층 is highly overall correlated with 출입구번호 and 1 other field (High correlation)
역명 is highly overall correlated with 출입구번호 and 1 other field (High correlation)
휠체어리프트의 관리번호 is highly overall correlated with 출입구번호 (High correlation)
출입구번호 is highly imbalanced (67.7%) (Imbalance)
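
The 67.7% imbalance figure for 출입구번호 is consistent with an entropy-based score: one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy. The sketch below works under that assumption. The function name `imbalance_score` is hypothetical, and the counts are the ones this report lists for 출입구번호.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # assumption: a single class is scored as 0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)

# Counts for 출입구번호: "<NA>" appears 35 times, five other values once each.
exit_counts = [35, 1, 1, 1, 1, 1]
score = imbalance_score(exit_counts)
print(f"{score:.1%}")  # 67.7%
```

A score of 0 means perfectly balanced classes, and values near 1 mean that one class dominates.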

Reproduction

Analysis started: 2023-12-12 18:07:41.824156
Analysis finished: 2023-12-12 18:07:42.630446
Duration: 0.81 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

철도운영기관명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 452.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울교통공사
2nd row: 서울교통공사
3rd row: 서울교통공사
4th row: 서울교통공사
5th row: 서울교통공사

Common Values

Value | Count | Frequency (%)
서울교통공사 | 40 | 100.0%

Length

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

선명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 452.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 7호선
2nd row: 7호선
3rd row: 7호선
4th row: 7호선
5th row: 7호선

Common Values

Value | Count | Frequency (%)
7호선 | 40 | 100.0%

Length

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

역명
Categorical

HIGH CORRELATION 

Distinct: 14
Distinct (%): 35.0%
Missing: 0
Missing (%): 0.0%
Memory size: 452.0 B

Length

Max length: 11
Median length: 9
Mean length: 5.25
Min length: 2

Unique

Unique: 6
Unique (%): 15.0%
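
The report separates "Distinct" (how many different values occur) from "Unique" (how many values occur exactly once). The sketch below reproduces both figures for 역명 from the value counts listed in this report. The labels `other_1` to `other_4` are hypothetical stand-ins for the four stations the report groups under "Other values (4)".

```python
from collections import Counter

# Value counts for 역명 as listed in the report; "other_1".."other_4" are
# hypothetical placeholders for the stations grouped under "Other values (4)".
station_counts = Counter({
    "남구로": 11, "온수(성공회대입구)": 5, "고속터미널": 4, "가산디지털단지": 4,
    "건대입구": 3, "총신대입구(이수)": 3, "수락산": 2, "청담": 2, "마들": 1, "논현": 1,
    "other_1": 1, "other_2": 1, "other_3": 1, "other_4": 1,
})

n_rows = sum(station_counts.values())
distinct = len(station_counts)  # number of different values present
unique = sum(1 for c in station_counts.values() if c == 1)  # values seen exactly once

print(distinct, f"{distinct / n_rows:.1%}")  # 14 35.0%
print(unique, f"{unique / n_rows:.1%}")      # 6 15.0%
```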

Sample

1st row: 수락산
2nd row: 수락산
3rd row: 마들
4th row: 건대입구
5th row: 건대입구

Common Values

Value | Count | Frequency (%)
남구로 | 11 | 27.5%
온수(성공회대입구) | 5 | 12.5%
고속터미널 | 4 | 10.0%
가산디지털단지 | 4 | 10.0%
건대입구 | 3 | 7.5%
총신대입구(이수) | 3 | 7.5%
수락산 | 2 | 5.0%
청담 | 2 | 5.0%
마들 | 1 | 2.5%
논현 | 1 | 2.5%
Other values (4) | 4 | 10.0%

Length

[Figure: Histogram of lengths of the category]

휠체어리프트의 관리번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11
Distinct (%): 27.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.125
Minimum: 1
Maximum: 11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 492.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 2
Q3: 4
95-th percentile: 9.05
Maximum: 11
Range: 10
Interquartile range (IQR): 3
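
The quantile figures above can be checked against the value counts listed for this variable. The sketch below assumes quantiles are taken by linear interpolation between neighbouring order statistics, which is an assumption about how the report computes them. The helper `quantile` is hypothetical.

```python
import math

def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest order statistics."""
    pos = q * (len(sorted_values) - 1)
    lower = math.floor(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])

# Rebuild the 40 values of 휠체어리프트의 관리번호 from the value counts in this report.
value_counts = {1: 14, 2: 8, 3: 6, 4: 4, 5: 2, 6: 1, 7: 1, 8: 1, 9: 1, 10: 1, 11: 1}
values = sorted(v for v, n in value_counts.items() for _ in range(n))

q1, median, q3 = (quantile(values, q) for q in (0.25, 0.5, 0.75))
p95 = quantile(values, 0.95)
print(q1, median, q3, q3 - q1)  # 1.0 2.0 4.0 3.0
print(round(p95, 2))            # 9.05
```

The 95th percentile falls between the 38th and 39th sorted values (9 and 10), which is why it is not an integer.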

Descriptive statistics

Standard deviation: 2.6524058
Coefficient of variation (CV): 0.84876985
Kurtosis: 1.8639137
Mean: 3.125
Median Absolute Deviation (MAD): 1
Skewness: 1.5679554
Sum: 125
Variance: 7.0352564
Monotonicity: Not monotonic
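
Several of these descriptive statistics can be recomputed from the value counts listed for this variable. The sketch below uses Python's `statistics` module and assumes the sample (n - 1) definitions of variance and standard deviation, which match the reported figures. Skewness and kurtosis are left out.

```python
import statistics

# Rebuild the 40 values of 휠체어리프트의 관리번호 from the value counts in this report.
value_counts = {1: 14, 2: 8, 3: 6, 4: 4, 5: 2, 6: 1, 7: 1, 8: 1, 9: 1, 10: 1, 11: 1}
values = [v for v, n in value_counts.items() for _ in range(n)]

total = sum(values)                      # reported Sum
mean = statistics.mean(values)           # reported Mean
variance = statistics.variance(values)   # sample variance, divisor n - 1
std = statistics.stdev(values)           # square root of the sample variance
cv = std / mean                          # coefficient of variation
med = statistics.median(values)
mad = statistics.median([abs(v - med) for v in values])  # median absolute deviation

print(total, mean, mad)  # 125 3.125 1.0
print(variance, std, cv)  # compare with the reported Variance, Standard deviation, CV
```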
[Figure: Histogram with fixed size bins (bins=11)]

Common values

Value | Count | Frequency (%)
1 | 14 | 35.0%
2 | 8 | 20.0%
3 | 6 | 15.0%
4 | 4 | 10.0%
5 | 2 | 5.0%
6 | 1 | 2.5%
7 | 1 | 2.5%
8 | 1 | 2.5%
9 | 1 | 2.5%
10 | 1 | 2.5%

Minimum 10 values

Value | Count | Frequency (%)
1 | 14 | 35.0%
2 | 8 | 20.0%
3 | 6 | 15.0%
4 | 4 | 10.0%
5 | 2 | 5.0%
6 | 1 | 2.5%
7 | 1 | 2.5%
8 | 1 | 2.5%
9 | 1 | 2.5%
10 | 1 | 2.5%

Maximum 10 values

Value | Count | Frequency (%)
11 | 1 | 2.5%
10 | 1 | 2.5%
9 | 1 | 2.5%
8 | 1 | 2.5%
7 | 1 | 2.5%
6 | 1 | 2.5%
5 | 2 | 5.0%
4 | 4 | 10.0%
3 | 6 | 15.0%
2 | 8 | 20.0%

출입구번호
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 15.0%
Missing: 0
Missing (%): 0.0%
Memory size: 452.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.625
Min length: 1

Unique

Unique: 5
Unique (%): 12.5%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 35 | 87.5%
3 | 1 | 2.5%
2 | 1 | 2.5%
1 | 1 | 2.5%
4 | 1 | 2.5%
6 | 1 | 2.5%

Length

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

상세위치
Text

Distinct: 38
Distinct (%): 95.0%
Missing: 0
Missing (%): 0.0%
Memory size: 452.0 B

Length

Max length: 26
Median length: 22.5
Mean length: 19.45
Min length: 8

Characters and Unicode

Total characters: 778
Distinct characters: 70
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
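
As a small illustration of those character properties, the sketch below tallies the Unicode general category of every character in one of the 상세위치 sample rows, using Python's standard `unicodedata` module. The two-letter codes map to the category names used in this report, for example `Lo` for Other Letter, `Lu` for Uppercase Letter, `Nd` for Decimal Number, `Ps` and `Pe` for Open and Close Punctuation, and `Pd` for Dash Punctuation.

```python
import unicodedata
from collections import Counter

# Third sample row of 상세위치 from this report, normalised to composed form (NFC)
# so that each Hangul syllable is a single code point.
text = unicodedata.normalize("NFC", "(B1)대합실-대합실(B2-B1)")

# General category of each code point in the string.
categories = Counter(unicodedata.category(ch) for ch in text)

print(len(text))         # 18
print(dict(categories))  # per-category character counts
```

Summing such tallies over all 40 rows is one way to arrive at a table like "Most occurring categories".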

Unique

Unique: 36
Unique (%): 90.0%

Sample

1st row: (B2)도봉산방향 시/승-대(B3-B2)
2nd row: (B2)마들방향 시/승-대(B3-B2)
3rd row: (B1)대합실-대합실(B2-B1)
4th row: (F2)2환승계단(E/S56호기)(B1-2F)
5th row: (B1)3출구전 지하계단(B2-B1)

Words

Value | Count | Frequency (%)
b5)대림방향 | 3 | 4.7%
시/승-대(b2-b1 | 2 | 3.1%
f2)국철환승 | 2 | 3.1%
시/승-대(b5-b4 | 2 | 3.1%
b1)시점측 | 2 | 3.1%
대-대(b4-b2 | 2 | 3.1%
b1)국철환승 | 2 | 3.1%
시/승-대(b3-b2 | 2 | 3.1%
종/승-대(b5-b4 | 2 | 3.1%
승/종-대(b4-b3 | 2 | 3.1%
Other values (42) | 43 | 67.2%

Most occurring characters

ValueCountFrequency (%)
B 92
 
11.8%
( 88
 
11.3%
) 87
 
11.2%
- 64
 
8.2%
1 62
 
8.0%
2 36
 
4.6%
34
 
4.4%
30
 
3.9%
25
 
3.2%
F 23
 
3.0%
Other values (60) 237
30.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 240 | 30.8%
Decimal Number | 139 | 17.9%
Uppercase Letter | 122 | 15.7%
Open Punctuation | 88 | 11.3%
Close Punctuation | 87 | 11.2%
Dash Punctuation | 64 | 8.2%
Space Separator | 25 | 3.2%
Other Punctuation | 13 | 1.7%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
34
 
14.2%
30
 
12.5%
12
 
5.0%
11
 
4.6%
11
 
4.6%
9
 
3.8%
8
 
3.3%
8
 
3.3%
8
 
3.3%
7
 
2.9%
Other values (43) 102
42.5%

Decimal Number
Value | Count | Frequency (%)
1 | 62 | 44.6%
2 | 36 | 25.9%
4 | 16 | 11.5%
3 | 11 | 7.9%
5 | 10 | 7.2%
6 | 3 | 2.2%
9 | 1 | 0.7%

Uppercase Letter
Value | Count | Frequency (%)
B | 92 | 75.4%
F | 23 | 18.9%
M | 3 | 2.5%
S | 2 | 1.6%
E | 2 | 1.6%

Open Punctuation
Value | Count | Frequency (%)
( | 88 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 87 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 64 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 25 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 13 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 416 | 53.5%
Hangul | 240 | 30.8%
Latin | 122 | 15.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
34
 
14.2%
30
 
12.5%
12
 
5.0%
11
 
4.6%
11
 
4.6%
9
 
3.8%
8
 
3.3%
8
 
3.3%
8
 
3.3%
7
 
2.9%
Other values (43) 102
42.5%

Common
Value | Count | Frequency (%)
( | 88 | 21.2%
) | 87 | 20.9%
- | 64 | 15.4%
1 | 62 | 14.9%
2 | 36 | 8.7%
(space) | 25 | 6.0%
4 | 16 | 3.8%
/ | 13 | 3.1%
3 | 11 | 2.6%
5 | 10 | 2.4%
Other values (2) | 4 | 1.0%

Latin
Value | Count | Frequency (%)
B | 92 | 75.4%
F | 23 | 18.9%
M | 3 | 2.5%
S | 2 | 1.6%
E | 2 | 1.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 538 | 69.2%
Hangul | 240 | 30.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
B | 92 | 17.1%
( | 88 | 16.4%
) | 87 | 16.2%
- | 64 | 11.9%
1 | 62 | 11.5%
2 | 36 | 6.7%
(space) | 25 | 4.6%
F | 23 | 4.3%
4 | 16 | 3.0%
/ | 13 | 2.4%
Other values (7) | 32 | 5.9%

Hangul
ValueCountFrequency (%)
34
 
14.2%
30
 
12.5%
12
 
5.0%
11
 
4.6%
11
 
4.6%
9
 
3.8%
8
 
3.3%
8
 
3.3%
8
 
3.3%
7
 
2.9%
Other values (43) 102
42.5%

길이
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 452.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 125
2nd row: 125
3rd row: 125
4th row: 125
5th row: 125

Common Values

Value | Count | Frequency (%)
125 | 40 | 100.0%

Length

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]


폭
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 452.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 80
2nd row: 80
3rd row: 80
4th row: 80
5th row: 80

Common Values

Value | Count | Frequency (%)
80 | 40 | 100.0%

Length

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

시작층
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 17.5%
Missing: 0
Missing (%): 0.0%
Memory size: 452.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지하2
2nd row: 지하2
3rd row: 지하1
4th row: 지상2
5th row: 지하1

Common Values

Value | Count | Frequency (%)
지하1 | 13 | 32.5%
지상1 | 9 | 22.5%
지하2 | 6 | 15.0%
지상2 | 4 | 10.0%
지하5 | 4 | 10.0%
지하3 | 2 | 5.0%
지하4 | 2 | 5.0%

Length

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

종료층
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 12.5%
Missing: 0
Missing (%): 0.0%
Memory size: 452.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 1
Unique (%): 2.5%

Sample

1st row: 지하3
2nd row: 지하3
3rd row: 지하2
4th row: 지하1
5th row: 지하2

Common Values

Value | Count | Frequency (%)
지하1 | 15 | 37.5%
지하2 | 11 | 27.5%
지하3 | 7 | 17.5%
지하4 | 6 | 15.0%
지상2 | 1 | 2.5%

Length

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

Interactions

[Figure: Interactions plot]

Correlations

[Figure: correlation heatmap]

Variable | 역명 | 휠체어리프트의 관리번호 | 출입구번호 | 상세위치 | 시작층 | 종료층
역명 | 1.000 | 0.000 | 1.000 | 0.000 | 0.843 | 0.866
휠체어리프트의 관리번호 | 0.000 | 1.000 | 1.000 | 0.979 | 0.398 | 0.000
출입구번호 | 1.000 | 1.000 | 1.000 | 1.000 | NaN | NaN
상세위치 | 0.000 | 0.979 | 1.000 | 1.000 | 1.000 | 1.000
시작층 | 0.843 | 0.398 | NaN | 1.000 | 1.000 | 0.822
종료층 | 0.866 | 0.000 | NaN | 1.000 | 0.822 | 1.000

[Figure: correlation heatmap]

Variable | 종료층 | 출입구번호 | 시작층 | 역명
종료층 | 1.000 | 1.000 | 0.693 | 0.582
출입구번호 | 1.000 | 1.000 | 1.000 | 1.000
시작층 | 0.693 | 1.000 | 1.000 | 0.420
역명 | 0.582 | 1.000 | 0.420 | 1.000

[Figure: correlation heatmap]

Variable | 휠체어리프트의 관리번호 | 역명 | 출입구번호 | 시작층 | 종료층
휠체어리프트의 관리번호 | 1.000 | 0.000 | 1.000 | 0.151 | 0.000
역명 | 0.000 | 1.000 | 1.000 | 0.420 | 0.582
출입구번호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
시작층 | 0.151 | 0.420 | 1.000 | 1.000 | 0.693
종료층 | 0.000 | 0.582 | 1.000 | 0.693 | 1.000
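
The extracted text does not say which statistic lies behind each matrix. One common measure of association between two categorical columns is Cramér's V, sketched below in its uncorrected form as an illustration only. The function `cramers_v` and the toy data are hypothetical, and the example is not expected to reproduce the values in the matrices above.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_tot, y_tot = Counter(xs), Counter(ys)
    k = min(len(x_tot), len(y_tot))
    if k < 2:
        return float("nan")  # undefined when either variable is constant
    chi2 = 0.0
    for x, nx in x_tot.items():
        for y, ny in y_tot.items():
            expected = nx * ny / n  # count expected under independence
            chi2 += (joint.get((x, y), 0) - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (k - 1)))

# Toy data (hypothetical, not taken from the dataset): each start floor
# always pairs with the same end floor, so the association is perfect.
start = ["B1", "B1", "B2", "B2", "F1", "F1"]
end = ["B2", "B2", "B3", "B3", "B1", "B1"]
print(round(cramers_v(start, end), 3))  # 1.0
```

A value of 1.000 between two columns means that one column fully determines the other in the observed rows. With only 40 rows and many rare categories, that can happen easily.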

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
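
The report counts 0 missing cells, yet 35 of the 40 values in 출입구번호 are the literal string "<NA>", which is profiled as an ordinary category. The sketch below shows one way to count such placeholders as missing. The token set `MISSING_TOKENS` and the helper `missing_summary` are hypothetical and not part of the profiling tool.

```python
# Tokens treated as missing in this sketch (an assumption; adjust to the data source).
MISSING_TOKENS = {"<NA>", "", "NA", "NaN"}

def missing_summary(values, tokens=MISSING_TOKENS):
    """Return (count, share) of entries that are None or match a placeholder token."""
    missing = sum(1 for v in values if v is None or str(v).strip() in tokens)
    return missing, missing / len(values)

# 출입구번호 as summarised in this report: 35 "<NA>" strings and five real exit numbers.
exit_numbers = ["<NA>"] * 35 + ["3", "2", "1", "4", "6"]
count, share = missing_summary(exit_numbers)
print(count, f"{share:.1%}")  # 35 87.5%
```

Converting such placeholders to real missing values before profiling would change the Missing cells figure and the nullity plots for this column.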

Sample

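First rows

Index | 철도운영기관명 | 선명 | 역명 | 휠체어리프트의 관리번호 | 출입구번호 | 상세위치 | 길이 | 폭 | 시작층 | 종료층
0 | 서울교통공사 | 7호선 | 수락산 | 1 | <NA> | (B2)도봉산방향 시/승-대(B3-B2) | 125 | 80 | 지하2 | 지하3
1 | 서울교통공사 | 7호선 | 수락산 | 2 | <NA> | (B2)마들방향 시/승-대(B3-B2) | 125 | 80 | 지하2 | 지하3
2 | 서울교통공사 | 7호선 | 마들 | 1 | <NA> | (B1)대합실-대합실(B2-B1) | 125 | 80 | 지하1 | 지하2
3 | 서울교통공사 | 7호선 | 건대입구 | 1 | <NA> | (F2)2환승계단(E/S56호기)(B1-2F) | 125 | 80 | 지상2 | 지하1
4 | 서울교통공사 | 7호선 | 건대입구 | 2 | <NA> | (B1)3출구전 지하계단(B2-B1) | 125 | 80 | 지하1 | 지하2
5 | 서울교통공사 | 7호선 | 건대입구 | 3 | 3 | (F1)3출구(B1-F1)(병원입구) | 125 | 80 | 지상1 | 지하1
6 | 서울교통공사 | 7호선 | 청담 | 1 | <NA> | (B3)뚝섬유원지방향 승/종-대(B4-B3) | 125 | 80 | 지하3 | 지하4
7 | 서울교통공사 | 7호선 | 청담 | 2 | <NA> | (B3)강남구청방향 승/종-대(B4-B3) | 125 | 80 | 지하3 | 지하4
8 | 서울교통공사 | 7호선 | 논현 | 1 | <NA> | (B1)대-대(B2-B1) | 125 | 80 | 지하1 | 지하2
9 | 서울교통공사 | 7호선 | 반포 | 1 | <NA> | (B1)대-대(B2-B1) | 125 | 80 | 지하1 | 지하2

Last rows

Index | 철도운영기관명 | 선명 | 역명 | 휠체어리프트의 관리번호 | 출입구번호 | 상세위치 | 길이 | 폭 | 시작층 | 종료층
30 | 서울교통공사 | 7호선 | 가산디지털단지 | 2 | <NA> | (B1)23출입구 전 계단(B1-BM1) | 125 | 80 | 지하1 | 지하1
31 | 서울교통공사 | 7호선 | 가산디지털단지 | 3 | <NA> | (F2)국철환승 하행(승-승)(B1-F2) | 125 | 80 | 지상2 | 지하1
32 | 서울교통공사 | 7호선 | 가산디지털단지 | 4 | <NA> | (F2)국철환승 상행(승-승)(B1-F2) | 125 | 80 | 지상2 | 지하1
33 | 서울교통공사 | 7호선 | 광명사거리 | 1 | <NA> | (B2)상선 종/승(B3-B2) | 125 | 80 | 지하2 | 지하3
34 | 서울교통공사 | 7호선 | 온수(성공회대입구) | 1 | <NA> | (B1)국철환승 하선 시/승-대(B2-B1) | 125 | 80 | 지하1 | 지하2
35 | 서울교통공사 | 7호선 | 온수(성공회대입구) | 2 | <NA> | (B1)국철환승 상선 시/승-대(B2-B1) | 125 | 80 | 지하1 | 지하2
36 | 서울교통공사 | 7호선 | 온수(성공회대입구) | 3 | <NA> | (F1)국철환승(인천)(B1-F1) | 125 | 80 | 지상1 | 지하1
37 | 서울교통공사 | 7호선 | 온수(성공회대입구) | 4 | <NA> | (F1)국철환승(서울)(B1-F1) | 125 | 80 | 지상1 | 지하1
38 | 서울교통공사 | 7호선 | 온수(성공회대입구) | 5 | 6 | (F1)6출입구(B1-F1) | 125 | 80 | 지상1 | 지하1
39 | 서울교통공사 | 7호선 | 어린이대공원(세종대) | 1 | <NA> | (F1-F2) | 125 | 80 | 지상1 | 지상2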