Overview

Dataset statistics

Number of variables: 12
Number of observations: 10000
Missing cells: 522
Missing cells (%): 0.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.0 MiB
Average record size in memory: 108.0 B
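
The missing-cell percentage follows from the counts in this report: 12 variables times 10000 observations gives 120000 cells, and the two per-column missing counts listed under Alerts add up to the 522 missing cells. A minimal Python check of that arithmetic, using only figures quoted in the report (nothing is read from the dataset itself):

```python
# Figures copied from the report; variable names are illustrative.
n_variables = 12
n_observations = 10_000
missing_cells = 522

# Per-column missing counts as listed in the Alerts section.
missing_by_column = {"열차도착시간": 243, "열차출발시간": 279}

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing: {missing_pct:.3f}% of all cells")
print(f"per-column counts sum to {sum(missing_by_column.values())}")
```

The unrounded share is slightly above the 0.4% shown, which is consistent with the report rounding to one decimal place.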

Variable types

Numeric: 2
Categorical: 6
Text: 2
DateTime: 2

Dataset

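Description: Seoul urban railway timetable data, provided with fields such as line, station code, station name, direction, arrival time, and departure time. (Note: the station name "뚝섬유원지" was changed to "자양" as of 2024-02-29.)
Weekday/weekend (주중주말): DAY = weekday, SAT = Saturday, END = public holiday or Sunday.
Direction (방향): IN/OUT = inner/outer loop, UP/DOWN = up/down line.
Express flag (급행여부): 1 = express, 0 = regular.
Author: 서울교통공사 (Seoul Metro)
URL: https://www.data.go.kr/data/15098251/fileData.do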
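
The coded fields named in the description can be turned into readable labels with a small lookup. The sketch below transcribes the code tables from the description; the dictionary and function names are illustrative and not part of the dataset or of any library.

```python
# Code tables transcribed from the dataset description.
# Names (DAY_TYPE, DIRECTION, EXPRESS, decode_row) are hypothetical helpers.
DAY_TYPE = {"DAY": "weekday", "SAT": "Saturday", "END": "public holiday or Sunday"}
DIRECTION = {"IN": "inner loop", "OUT": "outer loop", "UP": "up line", "DOWN": "down line"}
EXPRESS = {"1": "express", "0": "regular"}


def decode_row(day_type: str, direction: str, express) -> dict:
    """Translate the three coded fields of one timetable row into labels."""
    return {
        "주중주말": DAY_TYPE.get(day_type, "unknown"),
        "방향": DIRECTION.get(direction, "unknown"),
        # The flag may arrive as int or str depending on how the file is read.
        "급행여부": EXPRESS.get(str(express), "unknown"),
    }


print(decode_row("SAT", "DOWN", "0"))
```

Unknown codes fall back to "unknown" rather than raising, which keeps the helper usable on rows whose values are not covered by the description.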

Alerts

급행여부 has constant value "0" (Constant)
출발역 is highly overall correlated with 고유번호 and 4 other fields (High correlation)
도착역 is highly overall correlated with 고유번호 and 4 other fields (High correlation)
호선 is highly overall correlated with 고유번호 and 5 other fields (High correlation)
고유번호 is highly overall correlated with 역사코드 and 5 other fields (High correlation)
역사코드 is highly overall correlated with 고유번호 and 3 other fields (High correlation)
주중주말 is highly overall correlated with 고유번호 and 3 other fields (High correlation)
방향 is highly overall correlated with 고유번호 and 3 other fields (High correlation)
열차도착시간 has 243 (2.4%) missing values (Missing)
열차출발시간 has 279 (2.8%) missing values (Missing)
고유번호 has unique values (Unique)

Reproduction

Analysis started: 2024-03-23 07:02:25.896173
Analysis finished: 2024-03-23 07:02:34.660423
Duration: 8.76 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
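
A report of this kind can be regenerated roughly as follows. This is a sketch under assumptions, not the configuration actually used: the real settings live in config.json, which is not reproduced here. The CSV path, its encoding, the report title, and the helper name `build_profile` are all hypothetical.

```python
from pathlib import Path


def build_profile(csv_path: str, out_html: str = "report.html") -> Path:
    """Profile a CSV export of the timetable and write an HTML report.

    Assumptions (not stated in the report): the data is a local CSV file,
    it is UTF-8 encoded, and default profiling options are acceptable.
    """
    # Imported lazily so this module loads without the third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding="utf-8")
    report = ProfileReport(
        df,
        title="Seoul urban railway timetable",  # hypothetical title
        dataset={
            "description": "서울 도시철도 시각표 데이터",
            "author": "서울교통공사",
            "url": "https://www.data.go.kr/data/15098251/fileData.do",
        },
    )
    out = Path(out_html)
    report.to_file(out)
    return out
```

A call such as `build_profile("timetable.csv")` would write `report.html` next to the script; the file name is a placeholder.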

Variables

고유번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50495.76
Minimum: 12
Maximum: 99996
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 12
5-th percentile: 5209.25
Q1: 25403
median: 50493
Q3: 75870.5
95-th percentile: 95355.55
Maximum: 99996
Range: 99984
Interquartile range (IQR): 50467.5

Descriptive statistics

Standard deviation: 28976.301
Coefficient of variation (CV): 0.57383633
Kurtosis: -1.2088091
Mean: 50495.76
Median Absolute Deviation (MAD): 25260
Skewness: -0.0055463511
Sum: 5.049576 × 10^8
Variance: 8.3962604 × 10^8
Monotonicity: Not monotonic
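
Several of the 고유번호 figures are related by simple identities: CV is the standard deviation divided by the mean, variance is the standard deviation squared, the sum is the mean times the number of observations, and IQR and range are differences of quantiles. The sketch below recomputes them from the rounded values printed in this report, so small differences in the last digits are expected; the profiler itself works from the raw data.

```python
# Values copied from the 고유번호 statistics in this report.
n = 10_000
mean = 50495.76
std = 28976.301
q1, q3 = 25403, 75870.5
minimum, maximum = 12, 99996

derived = {
    "cv": std / mean,          # coefficient of variation
    "variance": std ** 2,
    "sum": mean * n,
    "iqr": q3 - q1,            # interquartile range
    "range": maximum - minimum,
}

for name, value in derived.items():
    print(f"{name}: {value:.8g}")
```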
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
51985                1      < 0.1%
28917                1      < 0.1%
5195                 1      < 0.1%
64796                1      < 0.1%
2964                 1      < 0.1%
53930                1      < 0.1%
49684                1      < 0.1%
12023                1      < 0.1%
31425                1      < 0.1%
51091                1      < 0.1%
Other values (9990)  9990   99.9%

Value  Count  Frequency (%)
12     1      < 0.1%
14     1      < 0.1%
37     1      < 0.1%
44     1      < 0.1%
51     1      < 0.1%
65     1      < 0.1%
68     1      < 0.1%
73     1      < 0.1%
74     1      < 0.1%
82     1      < 0.1%

Value  Count  Frequency (%)
99996  1      < 0.1%
99982  1      < 0.1%
99970  1      < 0.1%
99960  1      < 0.1%
99948  1      < 0.1%
99945  1      < 0.1%
99922  1      < 0.1%
99890  1      < 0.1%
99886  1      < 0.1%
99879  1      < 0.1%

호선
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
4: 3127
3: 2847
2: 2015
5: 1957
6: 54

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: 3
3rd row: 5
4th row: 4
5th row: 5

Common Values

Value  Count  Frequency (%)
4      3127   31.3%
3      2847   28.5%
2      2015   20.2%
5      1957   19.6%
6      54     0.5%

Length

Histogram of lengths of the category


역사코드
Real number (ℝ)

HIGH CORRELATION 

Distinct: 195
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 979.4504
Minimum: 201
Maximum: 2649
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 201
5-th percentile: 211
Q1: 316
median: 416
Q3: 1763
95-th percentile: 2547
Maximum: 2649
Range: 2448
Interquartile range (IQR): 1447

Descriptive statistics

Standard deviation: 928.45515
Coefficient of variation (CV): 0.94793484
Kurtosis: -1.135763
Mean: 979.4504
Median Absolute Deviation (MAD): 183
Skewness: 0.80913727
Sum: 9794504
Variance: 862028.97
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
410                 96     1.0%
421                 93     0.9%
425                 93     0.9%
432                 92     0.9%
417                 90     0.9%
418                 90     0.9%
429                 89     0.9%
426                 89     0.9%
211                 87     0.9%
320                 87     0.9%
Other values (185)  9094   90.9%

Value  Count  Frequency (%)
201    52     0.5%
202    44     0.4%
203    40     0.4%
204    50     0.5%
205    47     0.5%
206    54     0.5%
207    41     0.4%
208    50     0.5%
209    41     0.4%
210    44     0.4%

Value  Count  Frequency (%)
2649   5      0.1%
2648   22     0.2%
2647   16     0.2%
2646   11     0.1%
2566   23     0.2%
2565   21     0.2%
2564   15     0.1%
2563   11     0.1%
2562   20     0.2%
2561   23     0.2%

역사명
Text

Distinct: 183
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 2
Mean length: 2.8731
Min length: 2

Characters and Unicode

Total characters: 28731
Distinct characters: 180
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 동대문역사문화공원
2nd row: 도곡
3rd row: 양평
4th row: 당고개
5th row: 여의나루

Value               Count  Frequency (%)
동대문역사문화공원  170    1.7%
충무로              155    1.6%
을지로3가           127    1.3%
사당                121    1.2%
교대                114    1.1%
종로3가             113    1.1%
상계                96     1.0%
회현                93     0.9%
동대문              93     0.9%
총신대입구          92     0.9%
Other values (173)  8826   88.3%

Most occurring characters

ValueCountFrequency (%)
1603
 
5.6%
970
 
3.4%
802
 
2.8%
781
 
2.7%
696
 
2.4%
678
 
2.4%
654
 
2.3%
606
 
2.1%
602
 
2.1%
559
 
1.9%
Other values (170) 20780
72.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 28406
98.9%
Decimal Number 325
 
1.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1603
 
5.6%
970
 
3.4%
802
 
2.8%
781
 
2.7%
696
 
2.5%
678
 
2.4%
654
 
2.3%
606
 
2.1%
602
 
2.1%
559
 
2.0%
Other values (168) 20455
72.0%
Decimal Number
ValueCountFrequency (%)
3 240
73.8%
4 85
 
26.2%

Most occurring scripts

ValueCountFrequency (%)
Hangul 28406
98.9%
Common 325
 
1.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1603
 
5.6%
970
 
3.4%
802
 
2.8%
781
 
2.7%
696
 
2.5%
678
 
2.4%
654
 
2.3%
606
 
2.1%
602
 
2.1%
559
 
2.0%
Other values (168) 20455
72.0%
Common
ValueCountFrequency (%)
3 240
73.8%
4 85
 
26.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 28406
98.9%
ASCII 325
 
1.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1603
 
5.6%
970
 
3.4%
802
 
2.8%
781
 
2.7%
696
 
2.5%
678
 
2.4%
654
 
2.3%
606
 
2.1%
602
 
2.1%
559
 
2.0%
Other values (168) 20455
72.0%
ASCII
ValueCountFrequency (%)
3 240
73.8%
4 85
 
26.2%

주중주말
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
DAY: 7125
SAT: 2875

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: DAY
2nd row: DAY
3rd row: DAY
4th row: SAT
5th row: DAY

Common Values

Value  Count  Frequency (%)
DAY    7125   71.2%
SAT    2875   28.7%

Length

Histogram of lengths of the category


방향
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
UP: 4029
DOWN: 3956
OUT: 1027
IN: 988

Length

Max length: 4
Median length: 2
Mean length: 2.8939
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: DOWN
2nd row: DOWN
3rd row: DOWN
4th row: DOWN
5th row: DOWN

Common Values

Value  Count  Frequency (%)
UP     4029   40.3%
DOWN   3956   39.6%
OUT    1027   10.3%
IN     988    9.9%

Length

Histogram of lengths of the category


급행여부
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
0: 10000

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      10000  100.0%

Length

Histogram of lengths of the category

열차코드
Text

Distinct: 2045
Distinct (%): 20.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 5
Median length: 4
Mean length: 4.2101
Min length: 4

Characters and Unicode

Total characters: 42101
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 186
Unique (%): 1.9%

Sample

1st row: 4319K
2nd row: 3347
3rd row: 5043
4th row: 4509K
5th row: 5631

Value                Count  Frequency (%)
3358                 18     0.2%
3311                 17     0.2%
4678K                16     0.2%
4533K                16     0.2%
3210                 16     0.2%
4575                 15     0.1%
4633K                15     0.1%
4506K                15     0.1%
3324                 15     0.1%
3294                 15     0.1%
Other values (2035)  9842   98.4%

Most occurring characters

ValueCountFrequency (%)
3 6180
14.7%
4 5573
13.2%
2 5311
12.6%
5 5280
12.5%
1 4417
10.5%
0 4080
9.7%
6 3390
8.1%
K 2101
 
5.0%
7 2092
 
5.0%
8 1848
 
4.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 40000
95.0%
Uppercase Letter 2101
 
5.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 6180
15.4%
4 5573
13.9%
2 5311
13.3%
5 5280
13.2%
1 4417
11.0%
0 4080
10.2%
6 3390
8.5%
7 2092
 
5.2%
8 1848
 
4.6%
9 1829
 
4.6%
Uppercase Letter
ValueCountFrequency (%)
K 2101
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 40000
95.0%
Latin 2101
 
5.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3 6180
15.4%
4 5573
13.9%
2 5311
13.3%
5 5280
13.2%
1 4417
11.0%
0 4080
10.2%
6 3390
8.5%
7 2092
 
5.2%
8 1848
 
4.6%
9 1829
 
4.6%
Latin
ValueCountFrequency (%)
K 2101
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 42101
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 6180
14.7%
4 5573
13.2%
2 5311
12.6%
5 5280
12.5%
1 4417
10.5%
0 4080
9.7%
6 3390
8.1%
K 2101
 
5.0%
7 2092
 
5.0%
8 1848
 
4.4%

열차도착시간
Date

MISSING 

Distinct: 3347
Distinct (%): 34.3%
Missing: 243
Missing (%): 2.4%
Memory size: 156.2 KiB
Minimum: 2024-03-23 00:00:00
Maximum: 2024-03-23 23:59:30
Histogram with fixed size bins (bins=50)

열차출발시간
Date

MISSING 

Distinct: 3358
Distinct (%): 34.5%
Missing: 279
Missing (%): 2.8%
Memory size: 156.2 KiB
Minimum: 2024-03-23 00:00:00
Maximum: 2024-03-23 23:59:40
Histogram with fixed size bins (bins=50)

출발역
Categorical

HIGH CORRELATION 

Distinct: 38
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
성수: 1919
당고개: 1491
오금: 1357
방화: 1022
오이도: 924
Other values (33): 3287

Length

Max length: 5
Median length: 2
Mean length: 2.4273
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 당고개
2nd row: 대화
3rd row: 방화
4th row: 당고개
5th row: 방화

Common Values

Value              Count  Frequency (%)
성수               1919   19.2%
당고개             1491   14.9%
오금               1357   13.6%
방화               1022   10.2%
오이도             924    9.2%
대화               907    9.1%
사당               469    4.7%
구파발             426    4.3%
마천               426    4.3%
하남검단산         382    3.8%
Other values (28)  677    6.8%

Length

Histogram of lengths of the category

도착역
Categorical

HIGH CORRELATION 

Distinct: 34
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
성수: 1911
당고개: 1537
오금: 1191
대화: 1031
오이도: 980
Other values (29): 3350

Length

Max length: 5
Median length: 2
Mean length: 2.4618
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 안산
2nd row: 오금
3rd row: 하남검단산
4th row: 오이도
5th row: 마천

Common Values

Value              Count  Frequency (%)
성수               1911   19.1%
당고개             1537   15.4%
오금               1191   11.9%
대화               1031   10.3%
오이도             980    9.8%
방화               899    9.0%
마천               495    5.0%
구파발             469    4.7%
하남검단산         465    4.7%
사당               454    4.5%
Other values (24)  568    5.7%

Length

Histogram of lengths of the category

Interactions

[Figure: four interaction plots]

Correlations

Variable  고유번호  호선   역사코드  주중주말  방향   출발역  도착역
고유번호  1.000     0.989  0.898     0.962     0.964  0.949   0.947
호선      0.989     1.000  0.904     0.428     0.649  1.000   1.000
역사코드  0.898     0.904  1.000     0.280     0.256  0.856   0.857
주중주말  0.962     0.428  0.280     1.000     0.472  0.652   0.647
방향      0.964     0.649  0.256     0.472     1.000  0.953   0.951
출발역    0.949     1.000  0.856     0.652     0.953  1.000   0.931
도착역    0.947     1.000  0.857     0.647     0.951  0.931   1.000

Variable  출발역  도착역  주중주말  방향   호선
출발역    1.000   0.460   0.526     0.814  0.998
도착역    0.460   1.000   0.522     0.815  0.999
주중주말  0.526   0.522   1.000     0.319  0.521
방향      0.814   0.815   0.319     1.000  0.580
호선      0.998   0.999   0.521     0.580  1.000

Variable  고유번호  역사코드  호선   주중주말  방향   출발역  도착역
고유번호  1.000     0.877     0.849  0.832     0.901  0.727   0.725
역사코드  0.877     1.000     0.575  0.340     0.213  0.607   0.607
호선      0.849     0.575     1.000  0.521     0.580  0.998   0.999
주중주말  0.832     0.340     0.521  1.000     0.319  0.526   0.522
방향      0.901     0.213     0.580  0.319     1.000  0.814   0.815
출발역    0.727     0.607     0.998  0.526     0.814  1.000   0.460
도착역    0.725     0.607     0.999  0.522     0.815  0.460   1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

(index)  고유번호  호선  역사코드  역사명  주중주말  방향  급행여부  열차코드  열차도착시간  열차출발시간  출발역  도착역
51984  51985  4  422  동대문역사문화공원  DAY  DOWN  0  4319K  18:41:00  18:41:30  당고개  안산
26523  26524  3  334  도곡  DAY  DOWN  0  3347  20:58:00  20:58:30  대화  오금
92301  92302  5  2523  양평  DAY  DOWN  0  5043  08:09:20  08:09:50  방화  하남검단산
73085  73086  4  409  당고개  SAT  DOWN  0  4509K  <NA>  05:30:00  당고개  오이도
93372  93373  5  2528  여의나루  DAY  DOWN  0  5631  14:43:40  14:44:00  방화  마천
84733  84734  5  2536  을지로4가  DAY  UP  0  5560  10:15:50  10:16:20  마천  방화
41193  41194  3  1948  원흥  SAT  UP  0  3206  16:01:00  16:01:30  오금  대화
74023  74024  4  413  쌍문  SAT  DOWN  0  4151  20:27:00  20:27:30  당고개  사당
84675  84676  5  2536  을지로4가  DAY  UP  0  5502  06:11:10  06:11:40  마천  방화
47890  47891  3  336  학여울  SAT  DOWN  0  3229  17:50:00  17:50:30  구파발  오금

(index)  고유번호  호선  역사코드  역사명  주중주말  방향  급행여부  열차코드  열차도착시간  열차출발시간  출발역  도착역
60525  60526  4  432  총신대입구  DAY  UP  0  4670  19:36:30  19:37:00  오이도  당고개
5606  5607  2  234  신도림  DAY  IN  0  2450  21:07:30  21:08:30  성수  성수
65903  65904  4  1761  정왕  SAT  UP  0  4674  20:51:30  20:52:00  오이도  당고개
12679  12680  2  201  시청  DAY  OUT  0  2107  08:13:00  08:14:00  성수  성수
40185  40186  3  314  홍제  SAT  UP  0  3254K  17:55:00  17:55:30  오금  대화
66357  66358  4  1756  중앙  SAT  UP  0  4530  08:31:00  08:31:30  오이도  당고개
20898  20899  3  1956  정발산  DAY  DOWN  0  3143  10:22:00  10:22:30  대화  오금
61583  61584  4  427  숙대입구  DAY  UP  0  4614  15:17:00  15:17:30  오이도  당고개
11404  11405  2  207  상왕십리  DAY  OUT  0  2361  17:11:30  17:12:00  성수  성수
68739  68740  4  431  동작  SAT  UP  0  4132  18:23:30  18:24:00  사당  당고개
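
The sample rows pair each 열차도착시간 with a 열차출발시간, so a per-stop dwell time can be derived from them. The sketch below assumes HH:MM:SS strings and `<NA>` as the missing marker, as they appear in the sample. Treating a departure earlier than the arrival as a midnight crossing is an assumption, not something the report states. The function name `dwell_seconds` is illustrative.

```python
from datetime import datetime
from typing import Optional


def dwell_seconds(arrival: str, departure: str) -> Optional[int]:
    """Seconds between arrival and departure, or None if either is missing."""
    if "<NA>" in (arrival, departure):
        return None
    fmt = "%H:%M:%S"
    delta = datetime.strptime(departure, fmt) - datetime.strptime(arrival, fmt)
    # Modulo one day: a negative difference is read as crossing midnight (assumption).
    return int(delta.total_seconds()) % 86_400


# (역사명, 열차도착시간, 열차출발시간) taken from the sample rows.
rows = [
    ("동대문역사문화공원", "18:41:00", "18:41:30"),
    ("당고개", "<NA>", "05:30:00"),
    ("신도림", "21:07:30", "21:08:30"),
]
for station, arr, dep in rows:
    print(station, dwell_seconds(arr, dep))
```

Returning None for rows with a missing time keeps those rows distinguishable from a genuine zero-second dwell.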