Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 654.3 KiB
Average record size in memory: 67.0 B
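
The average record size can be cross-checked from the total memory size and the number of observations reported above. The following is a minimal sketch, assuming 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
total_size_kib = 654.3
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")
```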

Variable types

Text: 4
Numeric: 1
Categorical: 2

Dataset

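Description: 외부코드 (external code), 전철역코드 (station code), 전철역명 (station name), 종착역명 (terminal station name), 출발시간 (departure time), 요일 (day of week), 상/하행선 (up/down line direction)
Author: 서울교통공사 (Seoul Metro)
URL: https://data.seoul.go.kr/dataList/OA-109/S/1/datasetView.do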

Reproduction

Analysis started: 2024-05-18 07:30:43.678783
Analysis finished: 2024-05-18 07:30:46.170771
Duration: 2.49 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
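
The reported duration is the difference between the two timestamps above. A small sketch using only the standard library, with illustrative variable names:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section above.
started = datetime.fromisoformat("2024-05-18 07:30:43.678783")
finished = datetime.fromisoformat("2024-05-18 07:30:46.170771")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()

print(f"{duration:.2f} seconds")
```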

Variables

외부코드
Text

Distinct: 450
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 6
Median length: 3
Mean length: 3.0809
Min length: 3

Characters and Unicode

Total characters: 30809
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
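
As an illustration of those character properties, the sketch below counts Unicode general categories for the five sample values of this variable, using Python's standard `unicodedata` module. It is only a sketch of the idea and not necessarily how the profiling tool computes its figures.

```python
import unicodedata
from collections import Counter

# Sample values of this variable, copied from the report's sample rows.
samples = ["155", "322", "221", "211-2", "910"]

# General category of every character:
# "Nd" = Decimal Number, "Pd" = Dash Punctuation, "Lu" = Uppercase Letter.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)

print(category_counts)
```

The letter "P" that appears elsewhere in this column falls in the "Lu" category, which matches the Uppercase Letter row of the report.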

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 155
2nd row: 322
3rd row: 221
4th row: 211-2
5th row: 910

Value | Count | Frequency (%)
141 | 50 | 0.5%
140 | 50 | 0.5%
135 | 48 | 0.5%
920 | 47 | 0.5%
204 | 46 | 0.5%
936 | 46 | 0.5%
136 | 45 | 0.4%
917 | 45 | 0.4%
206 | 44 | 0.4%
137 | 44 | 0.4%
Other values (440) | 9535 | 95.3%

Most occurring characters

Value | Count | Frequency (%)
1 | 5352 | 17.4%
2 | 4877 | 15.8%
3 | 4160 | 13.5%
4 | 3893 | 12.6%
5 | 3111 | 10.1%
7 | 2130 | 6.9%
6 | 1874 | 6.1%
9 | 1796 | 5.8%
0 | 1717 | 5.6%
8 | 1225 | 4.0%
Other values (2) | 674 | 2.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 30135 | 97.8%
Uppercase Letter | 539 | 1.7%
Dash Punctuation | 135 | 0.4%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 5352 | 17.8%
2 | 4877 | 16.2%
3 | 4160 | 13.8%
4 | 3893 | 12.9%
5 | 3111 | 10.3%
7 | 2130 | 7.1%
6 | 1874 | 6.2%
9 | 1796 | 6.0%
0 | 1717 | 5.7%
8 | 1225 | 4.1%

Uppercase Letter
Value | Count | Frequency (%)
P | 539 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 135 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 30270 | 98.3%
Latin | 539 | 1.7%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 5352 | 17.7%
2 | 4877 | 16.1%
3 | 4160 | 13.7%
4 | 3893 | 12.9%
5 | 3111 | 10.3%
7 | 2130 | 7.0%
6 | 1874 | 6.2%
9 | 1796 | 5.9%
0 | 1717 | 5.7%
8 | 1225 | 4.0%

Latin
Value | Count | Frequency (%)
P | 539 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 30809 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 5352 | 17.4%
2 | 4877 | 15.8%
3 | 4160 | 13.5%
4 | 3893 | 12.6%
5 | 3111 | 10.1%
7 | 2130 | 6.9%
6 | 1874 | 6.1%
9 | 1796 | 5.8%
0 | 1717 | 5.6%
8 | 1225 | 4.0%
Other values (2) | 674 | 2.2%

전철역코드
Real number (ℝ)

Distinct: 450
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1747.2437
Minimum: 150
Maximum: 4138
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 150
5th percentile: 205
Q1: 336
Median: 1808
Q3: 2644.25
95th percentile: 4118
Maximum: 4138
Range: 3988
Interquartile range (IQR): 2308.25
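
The range and the interquartile range follow directly from the other quantile figures. A short worked check, with illustrative variable names:

```python
# Quantile figures copied from the report above.
minimum, maximum = 150, 4138
q1, q3 = 336, 2644.25

# Range is max minus min; IQR is the spread of the middle 50% of values.
value_range = maximum - minimum
iqr = q3 - q1

print(value_range, iqr)
```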

Descriptive statistics

Standard deviation: 1271.1999
Coefficient of variation (CV): 0.72754586
Kurtosis: -1.0837311
Mean: 1747.2437
Median Absolute Deviation (MAD): 939
Skewness: 0.2315259
Sum: 17472437
Variance: 1615949.2
Monotonicity: Not monotonic
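
Several of these statistics are related by simple identities: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below checks the reported figures against each other. Small differences in the last digits are expected because the inputs are already rounded.

```python
# Figures copied from the report above.
n_observations = 10_000
total = 17_472_437
std = 1271.1999

mean = total / n_observations   # sum / n
cv = std / mean                 # coefficient of variation
variance = std ** 2             # squared standard deviation

print(mean, round(cv, 6), round(variance, 1))
```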
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
1701 | 50 | 0.5%
1007 | 50 | 0.5%
1003 | 48 | 0.5%
4120 | 47 | 0.5%
204 | 46 | 0.5%
4136 | 46 | 0.5%
1004 | 45 | 0.4%
4117 | 45 | 0.4%
206 | 44 | 0.4%
1005 | 44 | 0.4%
Other values (440) | 9535 | 95.3%

Value | Count | Frequency (%)
150 | 38 | 0.4%
151 | 40 | 0.4%
152 | 34 | 0.3%
153 | 41 | 0.4%
154 | 40 | 0.4%
155 | 34 | 0.3%
156 | 23 | 0.2%
157 | 31 | 0.3%
158 | 25 | 0.2%
159 | 41 | 0.4%

Value | Count | Frequency (%)
4138 | 16 | 0.2%
4137 | 16 | 0.2%
4136 | 46 | 0.5%
4135 | 20 | 0.2%
4134 | 21 | 0.2%
4133 | 35 | 0.4%
4132 | 16 | 0.2%
4131 | 8 | 0.1%
4130 | 36 | 0.4%
4129 | 20 | 0.2%

전철역명
Text

Distinct: 397
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 2
Mean length: 2.8567
Min length: 2

Characters and Unicode

Total characters: 28567
Distinct characters: 251
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 간석
2nd row: 불광
3rd row: 역삼
4th row: 신답
5th row: 염창

Value | Count | Frequency (%)
종로3가 | 96 | 1.0%
노량진 | 90 | 0.9%
신도림 | 83 | 0.8%
동대문역사문화공원 | 81 | 0.8%
을지로4가 | 76 | 0.8%
충무로 | 71 | 0.7%
고속터미널 | 71 | 0.7%
동대문 | 70 | 0.7%
동작 | 69 | 0.7%
대림 | 68 | 0.7%
Other values (387) | 9225 | 92.2%

Most occurring characters

ValueCountFrequency (%)
1127
 
3.9%
814
 
2.8%
802
 
2.8%
794
 
2.8%
737
 
2.6%
533
 
1.9%
503
 
1.8%
465
 
1.6%
462
 
1.6%
457
 
1.6%
Other values (241) 21873
76.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 28298 | 99.1%
Decimal Number | 269 | 0.9%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1127
 
4.0%
814
 
2.9%
802
 
2.8%
794
 
2.8%
737
 
2.6%
533
 
1.9%
503
 
1.8%
465
 
1.6%
462
 
1.6%
457
 
1.6%
Other values (238) 21604
76.3%
Decimal Number
Value | Count | Frequency (%)
3 | 153 | 56.9%
4 | 76 | 28.3%
5 | 40 | 14.9%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 28298 | 99.1%
Common | 269 | 0.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1127
 
4.0%
814
 
2.9%
802
 
2.8%
794
 
2.8%
737
 
2.6%
533
 
1.9%
503
 
1.8%
465
 
1.6%
462
 
1.6%
457
 
1.6%
Other values (238) 21604
76.3%
Common
Value | Count | Frequency (%)
3 | 153 | 56.9%
4 | 76 | 28.3%
5 | 40 | 14.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 28298 | 99.1%
ASCII | 269 | 0.9%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1127
 
4.0%
814
 
2.9%
802
 
2.8%
794
 
2.8%
737
 
2.6%
533
 
1.9%
503
 
1.8%
465
 
1.6%
462
 
1.6%
457
 
1.6%
Other values (238) 21604
76.3%
ASCII
Value | Count | Frequency (%)
3 | 153 | 56.9%
4 | 76 | 28.3%
5 | 40 | 14.9%

종착역명
Text

Distinct: 89
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 6
Median length: 2
Mean length: 2.5652
Min length: 2

Characters and Unicode

Total characters: 25652
Distinct characters: 127
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: 인천
2nd row: 대화
3rd row: 성수
4th row: 성수
5th row: 중앙보훈병원

Value | Count | Frequency (%)
성수 | 1217 | 12.2%
방화 | 557 | 5.6%
인천 | 446 | 4.5%
당고개 | 415 | 4.2%
중앙보훈병원 | 406 | 4.1%
오금 | 351 | 3.5%
도봉산 | 343 | 3.4%
오이도 | 332 | 3.3%
개화 | 324 | 3.2%
대화 | 307 | 3.1%
Other values (79) | 5302 | 53.0%

Most occurring characters

ValueCountFrequency (%)
1612
 
6.3%
1440
 
5.6%
1241
 
4.8%
1111
 
4.3%
1013
 
3.9%
862
 
3.4%
750
 
2.9%
694
 
2.7%
658
 
2.6%
651
 
2.5%
Other values (117) 15620
60.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 25652 | 100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1612
 
6.3%
1440
 
5.6%
1241
 
4.8%
1111
 
4.3%
1013
 
3.9%
862
 
3.4%
750
 
2.9%
694
 
2.7%
658
 
2.6%
651
 
2.5%
Other values (117) 15620
60.9%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 25652 | 100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1612
 
6.3%
1440
 
5.6%
1241
 
4.8%
1111
 
4.3%
1013
 
3.9%
862
 
3.4%
750
 
2.9%
694
 
2.7%
658
 
2.6%
651
 
2.5%
Other values (117) 15620
60.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 25652 | 100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1612
 
6.3%
1440
 
5.6%
1241
 
4.8%
1111
 
4.3%
1013
 
3.9%
862
 
3.4%
750
 
2.9%
694
 
2.7%
658
 
2.6%
651
 
2.5%
Other values (117) 15620
60.9%

출발시간
Text

Distinct: 2195
Distinct (%): 21.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 80000
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 723
Unique (%): 7.2%

Sample

1st row: 19:45:30
2nd row: 22:43:00
3rd row: 21:09:30
4th row: 21:51:00
5th row: 22:38:05

Value | Count | Frequency (%)
19:09:30 | 23 | 0.2%
19:57:30 | 23 | 0.2%
20:56:00 | 22 | 0.2%
19:50:00 | 22 | 0.2%
19:11:30 | 21 | 0.2%
19:08:30 | 21 | 0.2%
19:11:00 | 21 | 0.2%
19:16:00 | 21 | 0.2%
20:25:30 | 21 | 0.2%
20:27:30 | 21 | 0.2%
Other values (2185) | 9784 | 97.8%

Most occurring characters

Value | Count | Frequency (%)
: | 20000 | 25.0%
0 | 17866 | 22.3%
2 | 13221 | 16.5%
1 | 7822 | 9.8%
3 | 7480 | 9.3%
5 | 3795 | 4.7%
4 | 3598 | 4.5%
9 | 3184 | 4.0%
7 | 1079 | 1.3%
6 | 990 | 1.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 60000 | 75.0%
Other Punctuation | 20000 | 25.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 17866 | 29.8%
2 | 13221 | 22.0%
1 | 7822 | 13.0%
3 | 7480 | 12.5%
5 | 3795 | 6.3%
4 | 3598 | 6.0%
9 | 3184 | 5.3%
7 | 1079 | 1.8%
6 | 990 | 1.7%
8 | 965 | 1.6%

Other Punctuation
Value | Count | Frequency (%)
: | 20000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 80000 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
: | 20000 | 25.0%
0 | 17866 | 22.3%
2 | 13221 | 16.5%
1 | 7822 | 9.8%
3 | 7480 | 9.3%
5 | 3795 | 4.7%
4 | 3598 | 4.5%
9 | 3184 | 4.0%
7 | 1079 | 1.3%
6 | 990 | 1.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 80000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
: | 20000 | 25.0%
0 | 17866 | 22.3%
2 | 13221 | 16.5%
1 | 7822 | 9.8%
3 | 7480 | 9.3%
5 | 3795 | 4.7%
4 | 3598 | 4.5%
9 | 3184 | 4.0%
7 | 1079 | 1.3%
6 | 990 | 1.2%

요일
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Value counts: 1 = 3799, 2 = 3148, 3 = 3053

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 1
3rd row: 1
4th row: 2
5th row: 3

Common Values

Value | Count | Frequency (%)
1 | 3799 | 38.0%
2 | 3148 | 31.5%
3 | 3053 | 30.5%
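
The frequency column in these tables is each count divided by the total number of observations. A minimal check against the counts reported for this variable, with illustrative variable names:

```python
# Counts per category value, copied from the Common Values table above.
counts = {"1": 3799, "2": 3148, "3": 3053}

# The counts should add up to the number of observations.
n_observations = sum(counts.values())

# Frequency in percent, rounded to one decimal as in the report.
frequencies = {
    value: round(100 * count / n_observations, 1)
    for value, count in counts.items()
}

print(n_observations, frequencies)
```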

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1 | 3799 | 38.0%
2 | 3148 | 31.5%
3 | 3053 | 30.5%

상/하행선
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Value counts: 2 = 5151, 1 = 4849

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 1
3rd row: 2
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
2 | 5151 | 51.5%
1 | 4849 | 48.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2 | 5151 | 51.5%
1 | 4849 | 48.5%

Interactions


Correlations

Variable | 전철역코드 | 종착역명 | 요일 | 상/하행선
전철역코드 | 1.000 | 0.950 | 0.043 | 0.031
종착역명 | 0.950 | 1.000 | 0.138 | 0.966
요일 | 0.043 | 0.138 | 1.000 | 0.000
상/하행선 | 0.031 | 0.966 | 0.000 | 1.000

Variable | 요일 | 상/하행선
요일 | 1.000 | 0.000
상/하행선 | 0.000 | 1.000

Variable | 전철역코드 | 요일 | 상/하행선
전철역코드 | 1.000 | 0.028 | 0.034
요일 | 0.028 | 1.000 | 0.000
상/하행선 | 0.034 | 0.000 | 1.000
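
The report text does not say which coefficient each matrix uses. One common association measure for a pair of categorical variables is Cramér's V, which is 0 for independent variables and 1 for perfectly associated ones. The sketch below is a plain, uncorrected version; the function name `cramers_v` and both contingency tables are hypothetical and are not taken from this dataset.

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table (list of rows of counts)."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Normalise by n times (smaller table dimension minus one).
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))

# Hypothetical tables: proportional rows (independent) and a diagonal table.
independent = [[20, 30], [40, 60]]
dependent = [[50, 0], [0, 50]]

print(cramers_v(independent), cramers_v(dependent))
```

A value of 0.000 between 요일 and 상/하행선, as in the matrices above, would be consistent with two variables whose category proportions do not depend on each other.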

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
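
As a rough sketch of what these two displays summarise, the code below builds a nullity matrix (one boolean per cell, true where a value is present) and per-column counts of present values. The two records are taken from the report's sample rows, with the field boundaries split by assumption; a missing value would be represented as `None`.

```python
# Column names as listed in the report's dataset description.
columns = ["외부코드", "전철역코드", "전철역명", "종착역명", "출발시간", "요일", "상/하행선"]

# Two records from the sample rows (field boundaries are an assumption).
records = [
    dict(zip(columns, ["155", 1816, "간석", "인천", "19:45:30", 2, 2])),
    dict(zip(columns, ["322", 312, "불광", "대화", "22:43:00", 1, 1])),
]

# Nullity matrix: one row per record, True where a value is present.
nullity_matrix = [[rec[col] is not None for col in columns] for rec in records]

# Per-column count of present values, the quantity a nullity bar chart shows.
present_counts = {
    col: sum(row[i] for row in nullity_matrix) for i, col in enumerate(columns)
}
missing_cells = sum(len(records) - n for n in present_counts.values())

print(present_counts, missing_cells)
```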

Sample

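(index) | 외부코드 | 전철역코드 | 전철역명 | 종착역명 | 출발시간 | 요일 | 상/하행선
84237 | 155 | 1816 | 간석 | 인천 | 19:45:30 | 2 | 2
22518 | 322 | 312 | 불광 | 대화 | 22:43:00 | 1 | 1
53254 | 221 | 221 | 역삼 | 성수 | 21:09:30 | 1 | 2
39027 | 211-2 | 245 | 신답 | 성수 | 21:51:00 | 2 | 1
23953 | 910 | 4110 | 염창 | 중앙보훈병원 | 22:38:05 | 3 | 1
86629 | 440 | 1455 | 인덕원 | 오이도 | 19:39:30 | 3 | 2
78750 | 525 | 2526 | 신길 | 마천 | 19:59:50 | 1 | 2
37057 | 511 | 2512 | 개화산 | 방화 | 21:57:00 | 2 | 1
44682 | 711 | 2713 | 수락산 | 온수 | 21:34:20 | 2 | 2
44952 | 910 | 4110 | 염창 | 중앙보훈병원 | 21:33:30 | 2 | 1

(index) | 외부코드 | 전철역코드 | 전철역명 | 종착역명 | 출발시간 | 요일 | 상/하행선
36595 | 147 | 1814 | 소사 | 연천 | 21:58:30 | 3 | 1
26883 | 414 | 414 | 수유 | 진접 | 22:28:30 | 1 | 1
81996 | 633 | 2634 | 약수 | 응암 | 19:51:20 | 2 | 1
85388 | 543 | 2544 | 장한평 | 마천 | 19:42:30 | 1 | 2
68945 | 434 | 434 | 남태령 | 오이도 | 20:26:00 | 1 | 2
72293 | 112 | 1904 | 망월사 | 인천 | 20:17:00 | 1 | 2
26200 | 430 | 430 | 이촌 | 안산 | 22:31:00 | 3 | 2
97778 | 133 | 150 | 서울역 | 광운대 | 19:11:00 | 1 | 1
60832 | 141 | 1701 | 구로 | 소요산 | 20:48:30 | 3 | 1
87021 | 614 | 2615 | 연신내 | 봉화산 | 19:38:20 | 3 | 2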