Overview

Dataset statistics

Number of variables: 5
Number of observations: 2035
Missing cells: 1
Missing cells (%): < 0.1%
Duplicate rows: 126
Duplicate rows (%): 6.2%
Total size in memory: 81.6 KiB
Average record size in memory: 41.1 B
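
The percentages and the average record size in this table can be re-derived from the raw counts. The following is a minimal sketch that only redoes the arithmetic with the figures reported here; the variable names are illustrative, and it assumes the missing-cell percentage is taken over all cells (variables × observations).

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 5
n_observations = 2035
missing_cells = 1
duplicate_rows = 126
total_size_kib = 81.6

# Assumption: the missing-cell share is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.4f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

The results round to the reported 6.2% and 41.1 B, and the missing-cell share falls below the 0.1% threshold shown in the table.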

Variable types

Categorical: 2
Text: 2
Numeric: 1

Dataset

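Description: Data on the urban and metropolitan railway stations managed by 부산교통공사 (Busan Transportation Corporation), including the railway operator name, line name, station name, exit number, and the major facilities at each exit.
Author: 국가철도공단 (Korea National Railway)
URL: https://www.data.go.kr/data/15073467/fileData.do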

Alerts

철도운영기관명 has constant value "부산교통공사" (Constant)
Dataset has 126 (6.2%) duplicate rows (Duplicates)

Reproduction

Analysis started: 2023-12-12 00:34:31.563811
Analysis finished: 2023-12-12 00:34:32.034785
Duration: 0.47 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

철도운영기관명 (railway operator name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.0 KiB
부산교통공사: 2035

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 부산교통공사
2nd row: 부산교통공사
3rd row: 부산교통공사
4th row: 부산교통공사
5th row: 부산교통공사

Common Values

Value  Count  Frequency (%)
부산교통공사  2035  100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

선명 (line name)
Categorical

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 16.0 KiB
1호선: 965
2호선: 813
3호선: 152
4호선: 105

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1호선
2nd row: 1호선
3rd row: 1호선
4th row: 1호선
5th row: 1호선

Common Values

Value  Count  Frequency (%)
1호선  965  47.4%
2호선  813  40.0%
3호선  152  7.5%
4호선  105  5.2%
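
As a quick consistency check, the frequency column can be recomputed from the counts. This is a minimal sketch using only the per-line counts reported here (the underlying rows are not reproduced); the names `line_counts` and `shares` are illustrative.

```python
from collections import Counter

# Per-line row counts copied from the Common Values table for 선명.
line_counts = Counter({"1호선": 965, "2호선": 813, "3호선": 152, "4호선": 105})

total = sum(line_counts.values())

# Share of rows per line, rounded to one decimal place as in the report.
shares = {line: round(100 * n / total, 1) for line, n in line_counts.most_common()}

for line, pct in shares.items():
    print(f"{line}: {line_counts[line]} rows, {pct}%")
```

The counts sum to the 2035 observations, and the rounded shares agree with the reported frequencies.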

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

역명 (station name)
Text

Distinct: 102
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 16.0 KiB

Length

Max length: 16
Median length: 2
Mean length: 5.0574939
Min length: 2

Characters and Unicode

Total characters: 10292
Distinct characters: 158
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 다대포해수욕장
2nd row: 다대포해수욕장
3rd row: 다대포해수욕장
4th row: 다대포해수욕장
5th row: 다대포해수욕장

Value  Count  Frequency (%)
연산  178  8.7%
센텀시티(bexco·신세계  145  7.1%
중앙  62  3.0%
덕천(부산과기대  56  2.8%
하단(부산본병원  43  2.1%
서면  39  1.9%
종합운동장  39  1.9%
자갈치  39  1.9%
범일  35  1.7%
토성  33  1.6%
Other values (92)  1366  67.1%

Most occurring characters

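Value  Count  Frequency (%)
)  625  6.1%
(  625  6.1%
[not preserved]  544  5.3%
[not preserved]  413  4.0%
[not preserved]  362  3.5%
·  239  2.3%
[not preserved]  222  2.2%
[not preserved]  221  2.1%
[not preserved]  213  2.1%
[not preserved]  212  2.1%
Other values (148)  6616  64.3%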

Most occurring categories

Value  Count  Frequency (%)
Other Letter  8033  78.1%
Uppercase Letter  770  7.5%
Close Punctuation  625  6.1%
Open Punctuation  625  6.1%
Other Punctuation  239  2.3%

Most frequent character per category

Other Letter

Value  Count  Frequency (%)
[not preserved]  544  6.8%
[not preserved]  413  5.1%
[not preserved]  362  4.5%
[not preserved]  222  2.8%
[not preserved]  221  2.8%
[not preserved]  213  2.7%
[not preserved]  212  2.6%
[not preserved]  191  2.4%
[not preserved]  169  2.1%
[not preserved]  153  1.9%
Other values (138)  5333  66.4%

Uppercase Letter

Value  Count  Frequency (%)
B  160  20.8%
X  145  18.8%
E  145  18.8%
O  145  18.8%
C  145  18.8%
S  15  1.9%
K  15  1.9%

Close Punctuation

Value  Count  Frequency (%)
)  625  100.0%

Open Punctuation

Value  Count  Frequency (%)
(  625  100.0%

Other Punctuation

Value  Count  Frequency (%)
·  239  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  8033  78.1%
Common  1489  14.5%
Latin  770  7.5%

Most frequent character per script

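Hangul

Value  Count  Frequency (%)
[not preserved]  544  6.8%
[not preserved]  413  5.1%
[not preserved]  362  4.5%
[not preserved]  222  2.8%
[not preserved]  221  2.8%
[not preserved]  213  2.7%
[not preserved]  212  2.6%
[not preserved]  191  2.4%
[not preserved]  169  2.1%
[not preserved]  153  1.9%
Other values (138)  5333  66.4%

Latin

Value  Count  Frequency (%)
B  160  20.8%
X  145  18.8%
E  145  18.8%
O  145  18.8%
C  145  18.8%
S  15  1.9%
K  15  1.9%

Common

Value  Count  Frequency (%)
)  625  42.0%
(  625  42.0%
·  239  16.1%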

Most occurring blocks

Value  Count  Frequency (%)
Hangul  8033  78.1%
ASCII  2020  19.6%
None  239  2.3%

Most frequent character per block

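ASCII

Value  Count  Frequency (%)
)  625  30.9%
(  625  30.9%
B  160  7.9%
X  145  7.2%
E  145  7.2%
O  145  7.2%
C  145  7.2%
S  15  0.7%
K  15  0.7%

Hangul

Value  Count  Frequency (%)
[not preserved]  544  6.8%
[not preserved]  413  5.1%
[not preserved]  362  4.5%
[not preserved]  222  2.8%
[not preserved]  221  2.8%
[not preserved]  213  2.7%
[not preserved]  212  2.6%
[not preserved]  191  2.4%
[not preserved]  169  2.1%
[not preserved]  153  1.9%
Other values (138)  5333  66.4%

None

Value  Count  Frequency (%)
·  239  100.0%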

출구번호 (exit number)
Real number (ℝ)

Distinct: 17
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.034398
Minimum: 1
Maximum: 17
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 18.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 4
Q3: 7
95-th percentile: 12
Maximum: 17
Range: 16
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.5895341
Coefficient of variation (CV): 0.71300165
Kurtosis: 0.51353831
Mean: 5.034398
Median Absolute Deviation (MAD): 2
Skewness: 1.0143951
Sum: 10245
Variance: 12.884755
Monotonicity: Not monotonic
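
Several of these statistics are tied together by simple identities: mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, range = maximum - minimum, and IQR = Q3 - Q1. The following minimal sketch checks those identities against the reported figures; it does not recompute anything from the raw data, and the variable names are illustrative.

```python
import math

# Figures copied from the report for 출구번호.
n = 2035
total = 10245
std = 3.5895341
minimum, maximum = 1, 17
q1, q3 = 2, 7

mean = total / n              # arithmetic mean
variance = std ** 2           # variance is the squared standard deviation
cv = std / mean               # coefficient of variation
value_range = maximum - minimum
iqr = q3 - q1

print(f"mean={mean:.6f} variance={variance:.6f} cv={cv:.8f}")
print(f"range={value_range} iqr={iqr}")

# Compare against the reported values within rounding tolerance.
assert math.isclose(mean, 5.034398, abs_tol=1e-6)
assert math.isclose(variance, 12.884755, abs_tol=1e-5)
assert math.isclose(cv, 0.71300165, abs_tol=1e-6)
```

All five derived values agree with the report to the precision shown.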
[Figure: Histogram with fixed size bins (bins=17)]

Common values

Value  Count  Frequency (%)
2  307  15.1%
1  301  14.8%
3  261  12.8%
4  251  12.3%
5  166  8.2%
6  147  7.2%
7  147  7.2%
8  112  5.5%
9  74  3.6%
11  68  3.3%
Other values (7)  201  9.9%

Minimum 10 values

Value  Count  Frequency (%)
1  301  14.8%
2  307  15.1%
3  261  12.8%
4  251  12.3%
5  166  8.2%
6  147  7.2%
7  147  7.2%
8  112  5.5%
9  74  3.6%
10  63  3.1%

Maximum 10 values

Value  Count  Frequency (%)
17  19  0.9%
16  5  0.2%
15  10  0.5%
14  17  0.8%
13  27  1.3%
12  60  2.9%
11  68  3.3%
10  63  3.1%
9  74  3.6%
8  112  5.5%

출구별 주요시설명 (major facilities by exit)
Text

Distinct: 1330
Distinct (%): 65.4%
Missing: 1
Missing (%): < 0.1%
Memory size: 16.0 KiB

Length

Max length: 25
Median length: 18
Mean length: 6.9803343
Min length: 2

Characters and Unicode

Total characters: 14198
Distinct characters: 437
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
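
The reported mean lengths of the two text columns appear consistent with dividing the total character count by the number of non-missing rows. The sketch below checks this with the figures from the report; that the single missing value in 출구별 주요시설명 is excluded from the denominator is an inference from the numbers, not something the report states.

```python
# Figures copied from the report.
n_rows = 2035

station_total_chars = 10292    # 역명, no missing values
facility_total_chars = 14198   # 출구별 주요시설명
facility_missing = 1           # one missing cell in 출구별 주요시설명

# Mean length = total characters / number of non-missing rows (assumed).
station_mean_len = station_total_chars / n_rows
facility_mean_len = facility_total_chars / (n_rows - facility_missing)

print(f"역명 mean length: {station_mean_len:.7f}")
print(f"출구별 주요시설명 mean length: {facility_mean_len:.7f}")
```

Both results match the reported mean lengths (5.0574939 and 6.9803343) to the precision shown.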

Unique

Unique: 957
Unique (%): 47.1%

Sample

1st row: 부산은행
2nd row: 중현초등학교
3rd row: 우체국
4th row: 다대도서관
5th row: 다대포생태체험학습장

Value  Count  Frequency (%)
부산은행  39  1.6%
성형외과  27  1.1%
국민건강보험공단  20  0.8%
외환은행  16  0.7%
국민연금공단  15  0.6%
연산동지점  13  0.5%
vs  11  0.5%
부산광역시  11  0.5%
여성의원  10  0.4%
해운대  10  0.4%
Other values (1409)  2192  92.7%

Most occurring characters

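Value  Count  Frequency (%)
[not preserved]  537  3.8%
[not preserved]  521  3.7%
[not preserved]  428  3.0%
[not preserved]  423  3.0%
(space)  375  2.6%
[not preserved]  361  2.5%
[not preserved]  287  2.0%
[not preserved]  258  1.8%
[not preserved]  250  1.8%
[not preserved]  225  1.6%
Other values (427)  10533  74.2%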

Most occurring categories

Value  Count  Frequency (%)
Other Letter  13256  93.4%
Space Separator  375  2.6%
Decimal Number  245  1.7%
Uppercase Letter  146  1.0%
Other Punctuation  61  0.4%
Close Punctuation  38  0.3%
Open Punctuation  38  0.3%
Lowercase Letter  32  0.2%
Other Symbol  7  < 0.1%

Most frequent character per category

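Other Letter

Value  Count  Frequency (%)
[not preserved]  537  4.1%
[not preserved]  521  3.9%
[not preserved]  428  3.2%
[not preserved]  423  3.2%
[not preserved]  361  2.7%
[not preserved]  287  2.2%
[not preserved]  258  1.9%
[not preserved]  250  1.9%
[not preserved]  225  1.7%
[not preserved]  217  1.6%
Other values (382)  9749  73.5%

Uppercase Letter

Value  Count  Frequency (%)
C  29  19.9%
S  26  17.8%
V  23  15.8%
B  14  9.6%
G  13  8.9%
M  7  4.8%
K  6  4.1%
F  4  2.7%
P  4  2.7%
A  3  2.1%
Other values (10)  17  11.6%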

Lowercase Letter

Value  Count  Frequency (%)
s  8  25.0%
e  4  12.5%
m  4  12.5%
k  3  9.4%
o  3  9.4%
n  2  6.2%
b  2  6.2%
x  2  6.2%
c  2  6.2%
t  1  3.1%

Decimal Number

Value  Count  Frequency (%)
1  95  38.8%
2  66  26.9%
3  32  13.1%
4  20  8.2%
9  16  6.5%
5  9  3.7%
6  4  1.6%
0  3  1.2%

Other Punctuation

Value  Count  Frequency (%)
/  59  96.7%
&  2  3.3%

Space Separator

Value  Count  Frequency (%)
(space)  375  100.0%

Close Punctuation

Value  Count  Frequency (%)
)  38  100.0%

Open Punctuation

Value  Count  Frequency (%)
(  38  100.0%

Other Symbol

Value  Count  Frequency (%)
[not preserved]  7  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  13263  93.4%
Common  757  5.3%
Latin  178  1.3%

Most frequent character per script

Hangul

Value  Count  Frequency (%)
[not preserved]  537  4.0%
[not preserved]  521  3.9%
[not preserved]  428  3.2%
[not preserved]  423  3.2%
[not preserved]  361  2.7%
[not preserved]  287  2.2%
[not preserved]  258  1.9%
[not preserved]  250  1.9%
[not preserved]  225  1.7%
[not preserved]  217  1.6%
Other values (383)  9756  73.6%

Latin

Value  Count  Frequency (%)
C  29  16.3%
S  26  14.6%
V  23  12.9%
B  14  7.9%
G  13  7.3%
s  8  4.5%
M  7  3.9%
K  6  3.4%
e  4  2.2%
m  4  2.2%
Other values (21)  44  24.7%

Common

Value  Count  Frequency (%)
(space)  375  49.5%
1  95  12.5%
2  66  8.7%
/  59  7.8%
)  38  5.0%
(  38  5.0%
3  32  4.2%
4  20  2.6%
9  16  2.1%
5  9  1.2%
Other values (3)  9  1.2%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  13256  93.4%
ASCII  935  6.6%
None  7  < 0.1%

Most frequent character per block

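Hangul

Value  Count  Frequency (%)
[not preserved]  537  4.1%
[not preserved]  521  3.9%
[not preserved]  428  3.2%
[not preserved]  423  3.2%
[not preserved]  361  2.7%
[not preserved]  287  2.2%
[not preserved]  258  1.9%
[not preserved]  250  1.9%
[not preserved]  225  1.7%
[not preserved]  217  1.6%
Other values (382)  9749  73.5%

ASCII

Value  Count  Frequency (%)
(space)  375  40.1%
1  95  10.2%
2  66  7.1%
/  59  6.3%
)  38  4.1%
(  38  4.1%
3  32  3.4%
C  29  3.1%
S  26  2.8%
V  23  2.5%
Other values (34)  154  16.5%

None

Value  Count  Frequency (%)
[not preserved]  7  100.0%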

Interactions


Correlations

          선명    출구번호
선명      1.000  0.202
출구번호  0.202  1.000

          출구번호  선명
출구번호  1.000    0.123
선명      0.123    1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

Index  철도운영기관명  선명  역명  출구번호  출구별 주요시설명
0  부산교통공사  1호선  다대포해수욕장  1  부산은행
1  부산교통공사  1호선  다대포해수욕장  1  중현초등학교
2  부산교통공사  1호선  다대포해수욕장  1  우체국
3  부산교통공사  1호선  다대포해수욕장  1  다대도서관
4  부산교통공사  1호선  다대포해수욕장  1  다대포생태체험학습장
5  부산교통공사  1호선  다대포해수욕장  1  다대포해수욕장
6  부산교통공사  1호선  다대포해수욕장  1  몰운대종합사회복지관
7  부산교통공사  1호선  다대포해수욕장  3  119안전센터
8  부산교통공사  1호선  다대포해수욕장  4  다대포해변공원
9  부산교통공사  1호선  다대포해수욕장  4  다대포생선회먹거리타운

Last rows

Index  철도운영기관명  선명  역명  출구번호  출구별 주요시설명
2025  부산교통공사  4호선  윗반송  2  송운초등학교
2026  부산교통공사  4호선  고촌  2  운봉우체국
2027  부산교통공사  4호선  고촌  4  안평방면
2028  부산교통공사  4호선  고촌  1  반송방면
2029  부산교통공사  4호선  안평(고촌주택단지)  4  안평마을
2030  부산교통공사  4호선  안평(고촌주택단지)  1  한국수력원자력 ㈜
2031  부산교통공사  4호선  안평(고촌주택단지)  3  고촌마을
2032  부산교통공사  4호선  안평(고촌주택단지)  3  고불사
2033  부산교통공사  4호선  안평(고촌주택단지)  3  신진초등학교
2034  부산교통공사  4호선  안평(고촌주택단지)  2  반송방면

Duplicate rows

Most frequently occurring

Index  철도운영기관명  선명  역명  출구번호  출구별 주요시설명  # duplicates
89  부산교통공사  2호선  센텀시티(BEXCO·신세계)  9  길흉부외과의원  4
96  부산교통공사  2호선  센텀시티(BEXCO·신세계)  11  VS 성형외과  4
24  부산교통공사  1호선  연산  4  서울ms치과  3
34  부산교통공사  1호선  연산  8  미소아동여성병원  3
35  부산교통공사  1호선  연산  8  부산경상대학교  3
37  부산교통공사  1호선  연산  8  연산중학교  3
38  부산교통공사  1호선  연산  8  연일시장  3
43  부산교통공사  1호선  연산  10  연산4동주민센터  3
45  부산교통공사  1호선  연산  12  CS연합치과  3
46  부산교통공사  1호선  연산  12  대한웰니스병원  3