Overview

Dataset statistics

Number of variables: 5
Number of observations: 1041
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 10
Duplicate rows (%): 1.0%
Total size in memory: 40.8 KiB
Average record size in memory: 40.1 B

Variable types

Categorical: 4
Text: 1

Dataset

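Description: Data on the urban and metropolitan railway stations on Seoul Metropolitan Subway Line 3 (수도권 3호선), covering the railway operator name (철도운영기관명), line name (선명), station name (역명), exit number (출구번호), main facilities by exit (출구별 주요시설명), address, and related fields.
Author: 국가철도공단 (Korea National Railway)
URL: https://www.data.go.kr/data/15073463/fileData.do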

Alerts

선명 has constant value "3호선" (Constant)
Dataset has 10 (1.0%) duplicate rows (Duplicates)
역명 is highly overall correlated with 철도운영기관명 (High correlation)
철도운영기관명 is highly overall correlated with 역명 (High correlation)

Reproduction

Analysis started: 2023-12-12 07:52:17.742672
Analysis finished: 2023-12-12 07:52:18.302479
Duration: 0.56 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
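
The reported duration follows directly from the two timestamps. A minimal Python sketch of that arithmetic, assuming both timestamps are in the same time zone (the variable names are illustrative and not part of the report):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 07:52:17.742672")
finished = datetime.fromisoformat("2023-12-12 07:52:18.302479")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 0.56 seconds
```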

Variables

철도운영기관명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.3 KiB
서울교통공사: 921
코레일: 120

Length

Max length: 6
Median length: 6
Mean length: 5.6541787
Min length: 3
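
The mean length can be checked by hand from the two category counts: 서울교통공사 has 6 characters and 코레일 has 3. A minimal Python sketch of that arithmetic, assuming length is counted in Unicode code points (the variable names are illustrative, not taken from the profiling tool):

```python
# Category counts as reported for 철도운영기관명.
counts = {"서울교통공사": 921, "코레일": 120}

total_rows = sum(counts.values())
# Weighted sum of string lengths: each value's length times its row count.
total_chars = sum(len(value) * n for value, n in counts.items())
mean_length = total_chars / total_rows

print(total_rows)             # 1041
print(round(mean_length, 7))  # 5.6541787
```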

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 코레일
2nd row: 코레일
3rd row: 코레일
4th row: 코레일
5th row: 코레일

Common Values

Value: Count (Frequency %)
서울교통공사: 921 (88.5%)
코레일: 120 (11.5%)

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

선명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.3 KiB
3호선: 1041

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3호선
2nd row: 3호선
3rd row: 3호선
4th row: 3호선
5th row: 3호선

Common Values

Value: Count (Frequency %)
3호선: 1041 (100.0%)

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

역명
Categorical

HIGH CORRELATION 

Distinct: 40
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Memory size: 8.3 KiB
경복궁(정부서울청사): 81
안국: 73
양재(서초구청): 50
독립문: 49
매봉: 48
Other values (35): 740

Length

Max length: 12
Median length: 2
Mean length: 3.5994236
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 주엽
2nd row: 주엽
3rd row: 주엽
4th row: 주엽
5th row: 주엽

Common Values

Value: Count (Frequency %)
경복궁(정부서울청사): 81 (7.8%)
안국: 73 (7.0%)
양재(서초구청): 50 (4.8%)
독립문: 49 (4.7%)
매봉: 48 (4.6%)
불광: 46 (4.4%)
충무로: 42 (4.0%)
연신내: 36 (3.5%)
약수: 34 (3.3%)
수서: 31 (3.0%)
Other values (30): 551 (52.9%)

Length

[Figure: histogram of lengths of the category]

출구번호
Categorical

Distinct: 18
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 8.3 KiB
1: 200
4: 179
3: 176
2: 129
6: 106
Other values (13): 251

Length

Max length: 3
Median length: 1
Mean length: 1.0336215
Min length: 1

Unique

Unique: 4
Unique (%): 0.4%
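
The report lists "Distinct" and "Unique" separately. The figures are consistent with reading "Distinct" as the number of different values and "Unique" as the number of values that occur exactly once, though the report itself does not spell this out. A minimal Python sketch under that assumption, using a hypothetical helper and made-up exit numbers:

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique): the number of different values,
    and the number of values that appear exactly once."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique

# Hypothetical exit-number column, not the real data.
exits = ["1", "1", "2", "2", "2", "3", "10"]
print(distinct_and_unique(exits))  # (4, 2)
```

Here "3" and "10" each occur once, so they count as unique, while all four different values count as distinct.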

Sample

1st row: 1
2nd row: 1
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value: Count (Frequency %)
1: 200 (19.2%)
4: 179 (17.2%)
3: 176 (16.9%)
2: 129 (12.4%)
6: 106 (10.2%)
5: 104 (10.0%)
7: 68 (6.5%)
8: 39 (3.7%)
9: 17 (1.6%)
12: 6 (0.6%)
Other values (8): 17 (1.6%)

Length

[Figure: histogram of lengths of the category]

출구별 주요시설명
Text

Distinct: 885
Distinct (%): 85.0%
Missing: 0
Missing (%): 0.0%
Memory size: 8.3 KiB

Length

Max length: 21
Median length: 18
Mean length: 6.2862632
Min length: 2

Characters and Unicode

Total characters: 6544
Distinct characters: 398
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
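
As an illustration of how such character properties can be tallied, the sketch below uses Python's standard `unicodedata` module to count general categories in one facility name taken from the sample rows. The helper name `category_counts` is hypothetical, and this is not necessarily how the profiling tool computes its figures. The two-letter codes correspond to the category names used in this report: `Lo` is Other Letter, `Zs` is Space Separator, `Nd` is Decimal Number, and `Po` is Other Punctuation.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count Unicode general categories (e.g. 'Lo', 'Nd') over the characters of text."""
    return Counter(unicodedata.category(ch) for ch in text)

# One value of 출구별 주요시설명 from the sample rows.
sample = "현대 2/ 3/ 4차 아파트"
counts = category_counts(sample)
for code, n in counts.most_common():
    print(code, n)
# Lo 6
# Zs 4
# Nd 3
# Po 2
```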

Unique

Unique: 766
Unique (%): 73.6%

Sample

1st row: 장촌초등학교
2nd row: 인제대학교일산백병원
3rd row: 대화중학교
4th row: 교통개발연구원
5th row: 시설안전기술공단

Value: Count (Frequency %)
아파트: 9 (0.8%)
고교: 8 (0.7%)
대림아파트: 7 (0.6%)
한신아파트: 6 (0.5%)
현대아파트: 5 (0.4%)
우성아파트: 5 (0.4%)
우리은행: 5 (0.4%)
한국: 4 (0.3%)
방향: 4 (0.3%)
주민센터: 4 (0.3%)
Other values (940): 1112 (95.1%)

Most occurring characters

ValueCountFrequency (%)
225
 
3.4%
173
 
2.6%
164
 
2.5%
163
 
2.5%
149
 
2.3%
143
 
2.2%
141
 
2.2%
131
 
2.0%
126
 
1.9%
121
 
1.8%
Other values (388) 5008
76.5%

Most occurring categories

Value: Count (Frequency %)
Other Letter: 6188 (94.6%)
Space Separator: 131 (2.0%)
Decimal Number: 98 (1.5%)
Uppercase Letter: 42 (0.6%)
Other Punctuation: 36 (0.6%)
Close Punctuation: 22 (0.3%)
Open Punctuation: 22 (0.3%)
Dash Punctuation: 3 (< 0.1%)
Math Symbol: 2 (< 0.1%)

Most frequent character per category

Other Letter
ValueCountFrequency (%)
225
 
3.6%
173
 
2.8%
164
 
2.7%
163
 
2.6%
149
 
2.4%
143
 
2.3%
141
 
2.3%
126
 
2.0%
121
 
2.0%
119
 
1.9%
Other values (356) 4664
75.4%

Uppercase Letter
Value: Count (Frequency %)
K: 7 (16.7%)
S: 6 (14.3%)
T: 5 (11.9%)
N: 2 (4.8%)
G: 2 (4.8%)
L: 2 (4.8%)
W: 2 (4.8%)
I: 2 (4.8%)
V: 2 (4.8%)
Y: 2 (4.8%)
Other values (8): 10 (23.8%)

Decimal Number
Value: Count (Frequency %)
1: 33 (33.7%)
2: 33 (33.7%)
3: 15 (15.3%)
4: 8 (8.2%)
9: 3 (3.1%)
7: 3 (3.1%)
0: 2 (2.0%)
5: 1 (1.0%)

Space Separator
Value: Count (Frequency %)
(space): 131 (100.0%)

Other Punctuation
Value: Count (Frequency %)
/: 36 (100.0%)

Close Punctuation
Value: Count (Frequency %)
): 22 (100.0%)

Open Punctuation
Value: Count (Frequency %)
(: 22 (100.0%)

Dash Punctuation
Value: Count (Frequency %)
-: 3 (100.0%)

Math Symbol
Value: Count (Frequency %)
~: 2 (100.0%)

Most occurring scripts

Value: Count (Frequency %)
Hangul: 6188 (94.6%)
Common: 314 (4.8%)
Latin: 42 (0.6%)

Most frequent character per script

Hangul
ValueCountFrequency (%)
225
 
3.6%
173
 
2.8%
164
 
2.7%
163
 
2.6%
149
 
2.4%
143
 
2.3%
141
 
2.3%
126
 
2.0%
121
 
2.0%
119
 
1.9%
Other values (356) 4664
75.4%

Latin
Value: Count (Frequency %)
K: 7 (16.7%)
S: 6 (14.3%)
T: 5 (11.9%)
N: 2 (4.8%)
G: 2 (4.8%)
L: 2 (4.8%)
W: 2 (4.8%)
I: 2 (4.8%)
V: 2 (4.8%)
Y: 2 (4.8%)
Other values (8): 10 (23.8%)

Common
Value: Count (Frequency %)
(space): 131 (41.7%)
/: 36 (11.5%)
1: 33 (10.5%)
2: 33 (10.5%)
): 22 (7.0%)
(: 22 (7.0%)
3: 15 (4.8%)
4: 8 (2.5%)
9: 3 (1.0%)
7: 3 (1.0%)
Other values (4): 8 (2.5%)

Most occurring blocks

Value: Count (Frequency %)
Hangul: 6188 (94.6%)
ASCII: 356 (5.4%)

Most frequent character per block

Hangul
ValueCountFrequency (%)
225
 
3.6%
173
 
2.8%
164
 
2.7%
163
 
2.6%
149
 
2.4%
143
 
2.3%
141
 
2.3%
126
 
2.0%
121
 
2.0%
119
 
1.9%
Other values (356) 4664
75.4%

ASCII
Value: Count (Frequency %)
(space): 131 (36.8%)
/: 36 (10.1%)
1: 33 (9.3%)
2: 33 (9.3%)
): 22 (6.2%)
(: 22 (6.2%)
3: 15 (4.2%)
4: 8 (2.2%)
K: 7 (2.0%)
S: 6 (1.7%)
Other values (22): 43 (12.1%)

Correlations

Columns: 철도운영기관명, 역명, 출구번호
철도운영기관명: 1.000, 1.000, 0.064
역명: 1.000, 1.000, 0.655
출구번호: 0.064, 0.655, 1.000

Columns: 역명, 출구번호, 철도운영기관명
역명: 1.000, 0.225, 0.982
출구번호: 0.225, 1.000, 0.050
철도운영기관명: 0.982, 0.050, 1.000

Columns: 철도운영기관명, 역명, 출구번호
철도운영기관명: 1.000, 0.982, 0.050
역명: 0.982, 1.000, 0.225
출구번호: 0.050, 0.225, 1.000
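
The report does not state which statistic each matrix uses. For pairs of categorical variables, one common measure of association is Cramér's V, and the sketch below shows the plain (not bias-corrected) form as an assumption, not as the profiling tool's exact method. The function name and the rows are hypothetical. In the example every station belongs to exactly one operator, so the station fully determines the operator and V reaches 1, which is the kind of relationship that the high 역명 and 철도운영기관명 values suggest.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    pair_counts = Counter(zip(xs, ys))
    x_counts = Counter(xs)
    y_counts = Counter(ys)
    k = min(len(x_counts), len(y_counts)) - 1
    if n == 0 or k == 0:
        # Undefined when a variable is constant; returning 0.0 is a convention chosen here.
        return 0.0
    chi2 = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n
            observed = pair_counts.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))

# Hypothetical rows: each station has a single operator.
stations = ["주엽", "주엽", "오금", "오금", "안국", "안국"]
operators = ["코레일", "코레일", "서울교통공사", "서울교통공사", "서울교통공사", "서울교통공사"]
print(round(cramers_v(stations, operators), 3))  # 1.0
```

A constant column such as 선명 has no variation to associate with anything, which is why it does not appear in the matrices.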

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Row: 철도운영기관명, 선명, 역명, 출구번호, 출구별 주요시설명
0: 코레일, 3호선, 주엽, 1, 장촌초등학교
1: 코레일, 3호선, 주엽, 1, 인제대학교일산백병원
2: 코레일, 3호선, 주엽, 2, 대화중학교
3: 코레일, 3호선, 주엽, 2, 교통개발연구원
4: 코레일, 3호선, 주엽, 2, 시설안전기술공단
5: 코레일, 3호선, 주엽, 3, 고양종합운동장
6: 코레일, 3호선, 주엽, 3, 교통개발연구원
7: 코레일, 3호선, 주엽, 3, 시설안전기술공단
8: 코레일, 3호선, 주엽, 4, 농수산물유통센터
9: 코레일, 3호선, 주엽, 5, 장성초/ 중학교

Row: 철도운영기관명, 선명, 역명, 출구번호, 출구별 주요시설명
1031: 서울교통공사, 3호선, 오금, 4, 송파경찰서
1032: 서울교통공사, 3호선, 오금, 5, 농협중앙회(송파지점)
1033: 서울교통공사, 3호선, 오금, 5, 석촌중학교
1034: 서울교통공사, 3호선, 오금, 5, 신가초등학교
1035: 서울교통공사, 3호선, 오금, 6, 석촌중학교
1036: 서울교통공사, 3호선, 오금, 6, 신가초등학교
1037: 서울교통공사, 3호선, 오금, 6, 웃말공원
1038: 서울교통공사, 3호선, 오금, 7, 송파우체국
1039: 서울교통공사, 3호선, 오금, 7, 오금동주민센터
1040: 서울교통공사, 3호선, 오금, 7, 현대 2/ 3/ 4차 아파트

Duplicate rows

Most frequently occurring

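Row: 철도운영기관명, 선명, 역명, 출구번호, 출구별 주요시설명, # duplicates
0: 서울교통공사, 3호선, 녹번, 2, 대림아파트, 2
1: 서울교통공사, 3호선, 독립문, 2, 무악현대아파트, 2
2: 서울교통공사, 3호선, 독립문, 4, 삼호아파트, 2
3: 서울교통공사, 3호선, 독립문, 4, 천연동사무소, 2
4: 서울교통공사, 3호선, 무악재, 4, 한양아파트, 2
5: 서울교통공사, 3호선, 안국, 3, 창덕궁, 2
6: 서울교통공사, 3호선, 안국, 6, 국세청, 2
7: 서울교통공사, 3호선, 연신내, 3, 연천초등학교, 2
8: 서울교통공사, 3호선, 연신내, 3, 은혜초등학교, 2
9: 서울교통공사, 3호선, 연신내, 6, 선일여고, 2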
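
A duplicate row here is one whose values match another row in every column. A minimal Python sketch of how such rows could be counted with the standard library, assuming each row is represented as a tuple; the helper name is hypothetical and the rows are a small illustrative subset, not the full dataset:

```python
from collections import Counter

def duplicate_rows(rows):
    """Return {row: count} for every row that appears more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# Illustrative subset only; the first row is repeated on purpose.
rows = [
    ("서울교통공사", "3호선", "안국", "3", "창덕궁"),
    ("서울교통공사", "3호선", "안국", "3", "창덕궁"),
    ("서울교통공사", "3호선", "안국", "6", "국세청"),
    ("서울교통공사", "3호선", "오금", "4", "송파경찰서"),
]
dups = duplicate_rows(rows)
for row, n in dups.items():
    print(row, n)
```

Applied to the whole dataset, the report finds 10 such rows, each occurring twice; 10 out of 1041 observations is about 0.96%, which the report rounds to 1.0%.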