Overview

Dataset statistics

Number of variables: 7
Number of observations: 150
Missing cells: 82
Missing cells (%): 7.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 8.8 KiB
Average record size in memory: 59.9 B
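The percentages in this report follow from the raw counts. The sketch below recomputes three of them from figures quoted in the report. The helper name `pct` is hypothetical, and the idea that Distinct (%) is taken over non-missing rows is inferred from the numbers (11 of 68 for 출입구번호), not stated by the report.

```python
def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

n_rows, n_cols = 150, 7      # observations and variables
missing = 82                 # every missing cell is in 출입구번호

missing_cells_pct = pct(missing, n_rows * n_cols)  # share of all 1050 cells
missing_col_pct = pct(missing, n_rows)             # share within that one column
# Assumption: Distinct (%) uses the 68 non-missing rows as its base.
distinct_pct = pct(11, n_rows - missing)

print(missing_cells_pct, missing_col_pct, distinct_pct)
```

These values can be compared with Missing cells (%), the Missing alert for 출입구번호, and that column's Distinct (%).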

Variable types

Categorical: 3
Numeric: 3
Text: 1

Dataset

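Description: Elevator data for the urban and metropolitan railway stations on Seoul Metropolitan Area Line 2. The fields are the railway operator name (철도운영기관명), line name (선명), station name (역명), exit number (출입구번호), detailed location (상세위치), rated capacity in persons (정원_인원), and rated load in kg (정원_중량(kg)).
Author: 국가철도공단
URL: https://www.data.go.kr/data/15041390/fileData.do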

Alerts

철도운영기관명 has constant value "서울교통공사" (Constant)
선명 has constant value "2호선" (Constant)
정원_인원 is highly overall correlated with 정원_중량(kg) (High correlation)
정원_중량(kg) is highly overall correlated with 정원_인원 (High correlation)
출입구번호 has 82 (54.7%) missing values (Missing)

Reproduction

Analysis started: 2023-12-12 17:04:46.788073
Analysis finished: 2023-12-12 17:04:48.453369
Duration: 1.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
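A report of this kind can be regenerated from the source CSV. The following is a minimal sketch, not the configuration actually used here: the function name `build_report`, the report title, the output path, and the `cp949` encoding are all assumptions, and the real settings live in the config.json referenced above.

```python
def build_report(csv_path: str, out_path: str = "report.html") -> None:
    """Profile a CSV file and write the HTML report to out_path."""
    # Imported lazily so this module still loads where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is CP949-encoded, as Korean public datasets often are.
    df = pd.read_csv(csv_path, encoding="cp949")
    profile = ProfileReport(df, title="Line 2 elevator data")
    profile.to_file(out_path)
```

Calling `build_report("elevators.csv")` (a hypothetical file name) would write `report.html` next to the script.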

Variables

철도운영기관명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Value | Count
서울교통공사 | 150

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울교통공사
2nd row: 서울교통공사
3rd row: 서울교통공사
4th row: 서울교통공사
5th row: 서울교통공사

Common Values

Value | Count | Frequency (%)
서울교통공사 | 150 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; 서울교통공사 accounts for all 150 rows (100.0%)]

선명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Value | Count
2호선 | 150

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2호선
2nd row: 2호선
3rd row: 2호선
4th row: 2호선
5th row: 2호선

Common Values

Value | Count | Frequency (%)
2호선 | 150 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; 2호선 accounts for all 150 rows (100.0%)]

역명
Categorical

Distinct: 49
Distinct (%): 32.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Value | Count
삼성(무역센터) | 5
뚝섬 | 4
신도림 | 4
합정 | 4
잠실새내 | 4
Other values (44) | 129

Length

Max length: 11
Median length: 10
Mean length: 4.2733333
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강남
2nd row: 강남
3rd row: 강남
4th row: 강남
5th row: 강변(동서울터미널)

Common Values

Value | Count | Frequency (%)
삼성(무역센터) | 5 | 3.3%
뚝섬 | 4 | 2.7%
신도림 | 4 | 2.7%
합정 | 4 | 2.7%
잠실새내 | 4 | 2.7%
을지로입구 | 4 | 2.7%
대림(구로구청) | 4 | 2.7%
문래 | 4 | 2.7%
잠실(송파구청) | 4 | 2.7%
역삼 | 4 | 2.7%
Other values (39) | 109 | 72.7%

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
삼성(무역센터 | 5 | 3.3%
잠실(송파구청 | 4 | 2.7%
강남 | 4 | 2.7%
용두(동대문구청 | 4 | 2.7%
성수 | 4 | 2.7%
신촌 | 4 | 2.7%
아현 | 4 | 2.7%
역삼 | 4 | 2.7%
서초 | 4 | 2.7%
문래 | 4 | 2.7%
Other values (39) | 109 | 72.7%

출입구번호
Real number (ℝ)

MISSING 

Distinct: 11
Distinct (%): 16.2%
Missing: 82
Missing (%): 54.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.1029412
Minimum: 1
Maximum: 14
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 4
Q3: 6
95-th percentile: 9
Maximum: 14
Range: 13
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.7919912
Coefficient of variation (CV): 0.68048532
Kurtosis: 1.1078212
Mean: 4.1029412
Median Absolute Deviation (MAD): 2
Skewness: 0.95500801
Sum: 279
Variance: 7.7952151
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=11)]
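Common values

Value | Count | Frequency (%)
1 | 16 | 10.7%
6 | 11 | 7.3%
4 | 11 | 7.3%
2 | 9 | 6.0%
5 | 6 | 4.0%
3 | 5 | 3.3%
8 | 3 | 2.0%
10 | 2 | 1.3%
7 | 2 | 1.3%
9 | 2 | 1.3%
(Missing) | 82 | 54.7%

Smallest values

Value | Count | Frequency (%)
1 | 16 | 10.7%
2 | 9 | 6.0%
3 | 5 | 3.3%
4 | 11 | 7.3%
5 | 6 | 4.0%
6 | 11 | 7.3%
7 | 2 | 1.3%
8 | 3 | 2.0%
9 | 2 | 1.3%
10 | 2 | 1.3%

Largest values

Value | Count | Frequency (%)
14 | 1 | 0.7%
10 | 2 | 1.3%
9 | 2 | 1.3%
8 | 3 | 2.0%
7 | 2 | 1.3%
6 | 11 | 7.3%
5 | 6 | 4.0%
4 | 11 | 7.3%
3 | 5 | 3.3%
2 | 9 | 6.0%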
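The descriptive statistics for 출입구번호 can be recomputed from its value counts alone. This is a sketch under one assumption: the variance divides by n - 1 (the sample form), which is the choice that agrees with the reported figures.

```python
import math

# Value counts for 출입구번호 (non-missing rows only), as listed in the report.
counts = {1: 16, 2: 9, 3: 5, 4: 11, 5: 6, 6: 11, 7: 2, 8: 3, 9: 2, 10: 2, 14: 1}

n = sum(counts.values())                        # number of non-missing rows
total = sum(v * c for v, c in counts.items())   # compare with the reported Sum
mean = total / n
# Assumption: sample variance (divide by n - 1).
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                 # coefficient of variation

print(n, total, round(mean, 7), round(variance, 4), round(cv, 4))
```

Note that n is 68, not 150: the 82 missing rows are excluded before any statistic is taken.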

상세위치
Text

Distinct: 87
Distinct (%): 58.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 20
Median length: 19
Mean length: 13.006667
Min length: 11

Characters and Unicode

Total characters: 1951
Distinct characters: 31
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
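As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one sample value of 상세위치; the helper name `category_counts` is hypothetical, and this is not necessarily how the profiler itself computes its tables.

```python
import unicodedata
from collections import Counter

def category_counts(text: str) -> Counter:
    """Count the characters of text by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

sample = "(B1-F1)1번 출입구측"   # third sample row of 상세위치
counts = category_counts(sample)
for code, n in counts.most_common():
    print(code, n)
```

The two-letter codes map onto the category names used in this report: `Nd` is Decimal Number, `Lo` is Other Letter, `Lu` is Uppercase Letter, `Pd` is Dash Punctuation, `Ps` and `Pe` are Open and Close Punctuation, `Zs` is Space Separator, `Sm` is Math Symbol, and `Po` is Other Punctuation.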

Unique

Unique: 60
Unique (%): 40.0%

Sample

1st row: (B1-B2) 8-1
2nd row: (B1-B2) 3-4
3rd row: (B1-F1)1번 출입구측
4th row: (B1-F1)10번 출입구측
5th row: (F1-F2) 4-2

Value | Count | Frequency (%)
출입구측 | 54 | 18.1%
b1-b2 | 44 | 14.7%
b1-f1)1번 | 11 | 3.7%
f2-f3 | 11 | 3.7%
출입구 | 9 | 3.0%
3-2 | 8 | 2.7%
b1-f1)6번 | 8 | 2.7%
3-4 | 8 | 2.7%
f1-f2 | 7 | 2.3%
b1-f1)4번 | 6 | 2.0%
Other values (69) | 133 | 44.5%

Most occurring characters

Value | Count | Frequency (%)
- | 235 | 12.0%
1 | 222 | 11.4%
B | 179 | 9.2%
( | 167 | 8.6%
) | 167 | 8.6%
(space) | 150 | 7.7%
2 | 127 | 6.5%
F | 121 | 6.2%
3 | 76 | 3.9%
(not preserved) | 68 | 3.5%
Other values (21) | 439 | 22.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 543 | 27.8%
Other Letter | 384 | 19.7%
Uppercase Letter | 300 | 15.4%
Dash Punctuation | 235 | 12.0%
Open Punctuation | 167 | 8.6%
Close Punctuation | 167 | 8.6%
Space Separator | 150 | 7.7%
Math Symbol | 4 | 0.2%
Other Punctuation | 1 | 0.1%

Most frequent character per category

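Other Letter
Value | Count | Frequency (%)
(not preserved) | 68 | 17.7%
(not preserved) | 68 | 17.7%
(not preserved) | 68 | 17.7%
(not preserved) | 68 | 17.7%
(not preserved) | 56 | 14.6%
(not preserved) | 16 | 4.2%
(not preserved) | 16 | 4.2%
(not preserved) | 13 | 3.4%
(not preserved) | 3 | 0.8%
(not preserved) | 3 | 0.8%
Other values (3) | 5 | 1.3%

Decimal Number
Value | Count | Frequency (%)
1 | 222 | 40.9%
2 | 127 | 23.4%
3 | 76 | 14.0%
4 | 54 | 9.9%
8 | 18 | 3.3%
6 | 16 | 2.9%
7 | 16 | 2.9%
5 | 9 | 1.7%
9 | 3 | 0.6%
0 | 2 | 0.4%

Uppercase Letter
Value | Count | Frequency (%)
B | 179 | 59.7%
F | 121 | 40.3%

Dash Punctuation
Value | Count | Frequency (%)
- | 235 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 167 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 167 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 150 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 4 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 1 | 100.0%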

Most occurring scripts

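Value | Count | Frequency (%)
Common | 1267 | 64.9%
Hangul | 384 | 19.7%
Latin | 300 | 15.4%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 235 | 18.5%
1 | 222 | 17.5%
( | 167 | 13.2%
) | 167 | 13.2%
(space) | 150 | 11.8%
2 | 127 | 10.0%
3 | 76 | 6.0%
4 | 54 | 4.3%
8 | 18 | 1.4%
6 | 16 | 1.3%
Other values (6) | 35 | 2.8%

Hangul
Value | Count | Frequency (%)
(not preserved) | 68 | 17.7%
(not preserved) | 68 | 17.7%
(not preserved) | 68 | 17.7%
(not preserved) | 68 | 17.7%
(not preserved) | 56 | 14.6%
(not preserved) | 16 | 4.2%
(not preserved) | 16 | 4.2%
(not preserved) | 13 | 3.4%
(not preserved) | 3 | 0.8%
(not preserved) | 3 | 0.8%
Other values (3) | 5 | 1.3%

Latin
Value | Count | Frequency (%)
B | 179 | 59.7%
F | 121 | 40.3%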

Most occurring blocks

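Value | Count | Frequency (%)
ASCII | 1567 | 80.3%
Hangul | 384 | 19.7%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 235 | 15.0%
1 | 222 | 14.2%
B | 179 | 11.4%
( | 167 | 10.7%
) | 167 | 10.7%
(space) | 150 | 9.6%
2 | 127 | 8.1%
F | 121 | 7.7%
3 | 76 | 4.9%
4 | 54 | 3.4%
Other values (8) | 69 | 4.4%

Hangul
Value | Count | Frequency (%)
(not preserved) | 68 | 17.7%
(not preserved) | 68 | 17.7%
(not preserved) | 68 | 17.7%
(not preserved) | 68 | 17.7%
(not preserved) | 56 | 14.6%
(not preserved) | 16 | 4.2%
(not preserved) | 16 | 4.2%
(not preserved) | 13 | 3.4%
(not preserved) | 3 | 0.8%
(not preserved) | 3 | 0.8%
Other values (3) | 5 | 1.3%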

정원_인원
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.966667
Minimum: 8
Maximum: 24
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 KiB

Quantile statistics

Minimum: 8
5-th percentile: 10.45
Q1: 13
median: 15
Q3: 15
95-th percentile: 15
Maximum: 24
Range: 16
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 2.1501262
Coefficient of variation (CV): 0.15394698
Kurtosis: 3.2684551
Mean: 13.966667
Median Absolute Deviation (MAD): 0
Skewness: -0.21905451
Sum: 2095
Variance: 4.6230425
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=9)]
Common values

Value | Count | Frequency (%)
15 | 99 | 66.0%
11 | 23 | 15.3%
13 | 16 | 10.7%
9 | 5 | 3.3%
17 | 2 | 1.3%
8 | 2 | 1.3%
20 | 1 | 0.7%
10 | 1 | 0.7%
24 | 1 | 0.7%

Smallest values

Value | Count | Frequency (%)
8 | 2 | 1.3%
9 | 5 | 3.3%
10 | 1 | 0.7%
11 | 23 | 15.3%
13 | 16 | 10.7%
15 | 99 | 66.0%
17 | 2 | 1.3%
20 | 1 | 0.7%
24 | 1 | 0.7%

Largest values

Value | Count | Frequency (%)
24 | 1 | 0.7%
20 | 1 | 0.7%
17 | 2 | 1.3%
15 | 99 | 66.0%
13 | 16 | 10.7%
11 | 23 | 15.3%
10 | 1 | 0.7%
9 | 5 | 3.3%
8 | 2 | 1.3%
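The non-integer 5-th percentile and the zero MAD for 정원_인원 both follow from the value counts. The sketch below assumes percentiles are linearly interpolated between the two nearest ranks, which is inferred from the reported 10.45 rather than stated by the report; the helper name `percentile` is hypothetical.

```python
from statistics import median

# Value counts for 정원_인원 as listed in the report.
counts = {8: 2, 9: 5, 10: 1, 11: 23, 13: 16, 15: 99, 17: 2, 20: 1, 24: 1}
values = sorted(v for v, c in counts.items() for _ in range(c))

def percentile(sorted_vals: list, q: float) -> float:
    """Percentile q (0-100) with linear interpolation between nearest ranks."""
    pos = q / 100 * (len(sorted_vals) - 1)      # fractional index into the data
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

p5 = percentile(values, 5)
q1, q3 = percentile(values, 25), percentile(values, 75)

# MAD: median of absolute deviations from the median.
med = median(values)
mad = median(abs(v - med) for v in values)

print(p5, q1, q3, mad)
```

The MAD is zero because 99 of the 150 rows equal the median, so more than half of the absolute deviations are zero.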

정원_중량(kg)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 947.66667
Minimum: 600
Maximum: 1350
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 KiB

Quantile statistics

Minimum: 600
5-th percentile: 750
Q1: 1000
median: 1000
Q3: 1000
95-th percentile: 1000
Maximum: 1350
Range: 750
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 116.77097
Coefficient of variation (CV): 0.12321945
Kurtosis: 2.0844501
Mean: 947.66667
Median Absolute Deviation (MAD): 0
Skewness: -1.2758602
Sum: 142150
Variance: 13635.459
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=6)]
Common values

Value | Count | Frequency (%)
1000 | 114 | 76.0%
750 | 23 | 15.3%
900 | 6 | 4.0%
600 | 5 | 3.3%
1350 | 1 | 0.7%
1150 | 1 | 0.7%

Smallest values

Value | Count | Frequency (%)
600 | 5 | 3.3%
750 | 23 | 15.3%
900 | 6 | 4.0%
1000 | 114 | 76.0%
1150 | 1 | 0.7%
1350 | 1 | 0.7%

Largest values

Value | Count | Frequency (%)
1350 | 1 | 0.7%
1150 | 1 | 0.7%
1000 | 114 | 76.0%
900 | 6 | 4.0%
750 | 23 | 15.3%
600 | 5 | 3.3%
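The negative skewness of 정원_중량(kg) reflects the tail of lighter elevators (600 to 900 kg) below the dominant 1000 kg value. As a sketch, the figure can be recomputed from the value counts with the bias-corrected (adjusted Fisher-Pearson) estimator; that this is the estimator behind the report is an assumption, and the helper name `central_moment` is hypothetical.

```python
import math

# Value counts for 정원_중량(kg) as listed in the report.
counts = {600: 5, 750: 23, 900: 6, 1000: 114, 1150: 1, 1350: 1}

n = sum(counts.values())
mean = sum(v * c for v, c in counts.items()) / n

def central_moment(k: int) -> float:
    """k-th central moment in population form (divides by n)."""
    return sum(c * (v - mean) ** k for v, c in counts.items()) / n

m2, m3 = central_moment(2), central_moment(3)
g1 = m3 / m2 ** 1.5                             # uncorrected skewness
skew = math.sqrt(n * (n - 1)) / (n - 2) * g1    # bias-corrected skewness

print(round(mean, 5), round(skew, 4))
```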

Interactions

[Figures: 9 interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap]

 | 역명 | 출입구번호 | 상세위치 | 정원_인원 | 정원_중량(kg)
역명 | 1.000 | 0.000 | 0.858 | 0.820 | 0.869
출입구번호 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000
상세위치 | 0.858 | 1.000 | 1.000 | 0.923 | 0.973
정원_인원 | 0.820 | 0.000 | 0.923 | 1.000 | 0.873
정원_중량(kg) | 0.869 | 0.000 | 0.973 | 0.873 | 1.000

[Figure: correlation heatmap]

 | 출입구번호 | 정원_인원 | 정원_중량(kg) | 역명
출입구번호 | 1.000 | -0.051 | 0.083 | 0.000
정원_인원 | -0.051 | 1.000 | 0.825 | 0.394
정원_중량(kg) | 0.083 | 0.825 | 1.000 | 0.470
역명 | 0.000 | 0.394 | 0.470 | 1.000
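The report does not say which coefficient each matrix uses. As an illustration only, the sketch below computes a Spearman rank correlation (an assumed choice) between 정원_인원 and 정원_중량(kg) on the 20 rows shown in the Sample section. With 20 of 150 rows it will not reproduce the values in the tables above; the function names are hypothetical.

```python
def average_ranks(values: list) -> list:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x: list, y: list) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x: list, y: list) -> float:
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# (정원_인원, 정원_중량(kg)) for sample rows 0-9 and 140-149.
persons = [15, 15, 15, 13, 15, 15, 15, 20, 15, 15,
           13, 9, 8, 15, 15, 15, 15, 15, 15, 15]
weight = [1000, 1000, 1000, 1000, 1000, 1000, 1000, 1350, 1000, 1000,
          900, 600, 600, 1000, 1000, 1000, 1000, 1000, 1000, 1000]
rho = spearman(persons, weight)
print(round(rho, 3))
```

Averaging the ranks of tied values matters here because most rows share the same capacity (15 persons, 1000 kg).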

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

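First rows

 | 철도운영기관명 | 선명 | 역명 | 출입구번호 | 상세위치 | 정원_인원 | 정원_중량(kg)
0 | 서울교통공사 | 2호선 | 강남 | <NA> | (B1-B2) 8-1 | 15 | 1000
1 | 서울교통공사 | 2호선 | 강남 | <NA> | (B1-B2) 3-4 | 15 | 1000
2 | 서울교통공사 | 2호선 | 강남 | 1 | (B1-F1)1번 출입구측 | 15 | 1000
3 | 서울교통공사 | 2호선 | 강남 | 10 | (B1-F1)10번 출입구측 | 13 | 1000
4 | 서울교통공사 | 2호선 | 강변(동서울터미널) | <NA> | (F1-F2) 4-2 | 15 | 1000
5 | 서울교통공사 | 2호선 | 강변(동서울터미널) | <NA> | (F1-F2) 7-3 | 15 | 1000
6 | 서울교통공사 | 2호선 | 건대입구 | <NA> | (F2-F3)섬식(외)5-4 | 15 | 1000
7 | 서울교통공사 | 2호선 | 건대입구 | 6 | (F2-F3)7-3 ~6번 출입구측 | 20 | 1350
8 | 서울교통공사 | 2호선 | 교대(법원·검찰청) | <NA> | (B1-B2) 8-4 | 15 | 1000
9 | 서울교통공사 | 2호선 | 교대(법원·검찰청) | <NA> | (B1-B2) 3-1 | 15 | 1000

Last rows

 | 철도운영기관명 | 선명 | 역명 | 출입구번호 | 상세위치 | 정원_인원 | 정원_중량(kg)
140 | 서울교통공사 | 2호선 | 충정로(경기대입구) | <NA> | (B3-B2)섬식(외) 2-1 | 13 | 900
141 | 서울교통공사 | 2호선 | 충정로(경기대입구) | 5 | (B2-F1)5번 출입구측 | 9 | 600
142 | 서울교통공사 | 2호선 | 한양대 | <NA> | (F1-F2)섬식(외) 3-3 | 8 | 600
143 | 서울교통공사 | 2호선 | 한양대 | 4 | (F1-F2)4번 출입구(육교) | 15 | 1000
144 | 서울교통공사 | 2호선 | 합정 | <NA> | (B1-B2) 4-1 | 15 | 1000
145 | 서울교통공사 | 2호선 | 합정 | <NA> | (B1-B2) 7-3 | 15 | 1000
146 | 서울교통공사 | 2호선 | 합정 | 1 | (B1-F1)1번 출입구측 | 15 | 1000
147 | 서울교통공사 | 2호선 | 합정 | 4 | (B1-F1)4번 출입구측 | 15 | 1000
148 | 서울교통공사 | 2호선 | 홍대입구 | <NA> | (B1-B2)섬식(외)7-2 | 15 | 1000
149 | 서울교통공사 | 2호선 | 홍대입구 | 8 | (B1-F1)8번 출입구측 | 15 | 1000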