Overview

Dataset statistics

Number of variables: 6
Number of observations: 140
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.1 KiB
Average record size in memory: 51.9 B

Variable types

Numeric: 3
Text: 2
Categorical: 1

Dataset

Description: File download (파일 다운로드)
Author: Seoul Metro (서울교통공사)
URL: https://data.seoul.go.kr/dataList/OA-13290/F/1/datasetView.do

Alerts

연번 is highly overall correlated with 호선 (High correlation)
호선 is highly overall correlated with 연번 (High correlation)
연번 has unique values (Unique)

Reproduction

Analysis started: 2024-04-29 16:49:03.033298
Analysis finished: 2024-04-29 16:49:04.471985
Duration: 1.44 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
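
A report of this kind can be regenerated along the following lines. This is a minimal sketch, not the configuration actually used for this run: the function name `build_report`, the file paths, the `cp949` encoding, and the report title are all assumptions, and the `dataset` metadata simply mirrors the Dataset fields listed above.

```python
def build_report(csv_path: str, html_path: str) -> None:
    """Profile the transfer dataset and write an HTML report (illustrative sketch)."""
    # Imported lazily so this module still loads where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the downloaded CSV is CP949-encoded; change the encoding if it is not.
    df = pd.read_csv(csv_path, encoding="cp949")

    report = ProfileReport(
        df,
        title="Seoul Metro transfer distances",  # hypothetical title
        dataset={
            "description": "파일 다운로드",
            "author": "서울교통공사",
            "url": "https://data.seoul.go.kr/dataList/OA-13290/F/1/datasetView.do",
        },
    )
    report.to_file(html_path)
```

Calling `build_report("transfer.csv", "report.html")` with a local copy of the file (both names hypothetical) would write the HTML report next to the script.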

Variables

연번 (serial number)
Real number (ℝ)

Alerts: High correlation, Unique

Distinct: 140
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 70.5
Minimum: 1
Maximum: 140
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 KiB

Quantile statistics

Minimum: 1
5th percentile: 7.95
Q1: 35.75
Median: 70.5
Q3: 105.25
95th percentile: 133.05
Maximum: 140
Range: 139
Interquartile range (IQR): 69.5

Descriptive statistics

Standard deviation: 40.5586
Coefficient of variation (CV): 0.57529928
Kurtosis: -1.2
Mean: 70.5
Median Absolute Deviation (MAD): 35
Skewness: 0
Sum: 9870
Variance: 1645
Monotonicity: Strictly increasing

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1 | 1 | 0.7%
98 | 1 | 0.7%
92 | 1 | 0.7%
93 | 1 | 0.7%
94 | 1 | 0.7%
95 | 1 | 0.7%
96 | 1 | 0.7%
97 | 1 | 0.7%
99 | 1 | 0.7%
90 | 1 | 0.7%
Other values (130) | 130 | 92.9%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 0.7%
2 | 1 | 0.7%
3 | 1 | 0.7%
4 | 1 | 0.7%
5 | 1 | 0.7%
6 | 1 | 0.7%
7 | 1 | 0.7%
8 | 1 | 0.7%
9 | 1 | 0.7%
10 | 1 | 0.7%

Maximum 10 values

Value | Count | Frequency (%)
140 | 1 | 0.7%
139 | 1 | 0.7%
138 | 1 | 0.7%
137 | 1 | 0.7%
136 | 1 | 0.7%
135 | 1 | 0.7%
134 | 1 | 0.7%
133 | 1 | 0.7%
132 | 1 | 0.7%
131 | 1 | 0.7%
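
The quantile and descriptive statistics reported for 연번 can be checked by hand. The sketch below assumes the column is exactly the sequence 1 to 140, which is consistent with the reported minimum, maximum, distinct count, and sum (9870) but is not stated in the report. The `percentile` helper is a hypothetical name, and linear interpolation between neighbouring values is an assumption about how the quantiles were computed.

```python
import math
import statistics

# Assumption: 연번 is the complete sequence 1..140.
serial = list(range(1, 141))


def percentile(sorted_values, p):
    """Linearly interpolated percentile of sorted data, with p in [0, 1]."""
    position = p * (len(sorted_values) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


mean = statistics.mean(serial)
variance = statistics.variance(serial)  # sample variance (n - 1 denominator)
std_dev = math.sqrt(variance)
cv = std_dev / mean                     # coefficient of variation

p5 = percentile(serial, 0.05)
q1 = percentile(serial, 0.25)
q3 = percentile(serial, 0.75)
p95 = percentile(serial, 0.95)
iqr = q3 - q1

print(mean, variance, round(std_dev, 4), round(cv, 8))
print(p5, q1, q3, p95, iqr)
```

Under these assumptions the computed values agree with the figures listed for this variable, which suggests the report uses the sample variance and interpolated quantiles.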

호선 (line number)
Real number (ℝ)

Alerts: High correlation

Distinct: 9
Distinct (%): 6.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.2714286
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 4
Q3: 6
95th percentile: 8
Maximum: 9
Range: 8
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.1916876
Coefficient of variation (CV): 0.51310411
Kurtosis: -0.94237257
Mean: 4.2714286
Median Absolute Deviation (MAD): 2
Skewness: 0.20902823
Sum: 598
Variance: 4.8034943
Monotonicity: Increasing

[Figure: histogram with fixed size bins (bins=9)]

Common values

Value | Count | Frequency (%)
2 | 27 | 19.3%
5 | 25 | 17.9%
6 | 18 | 12.9%
3 | 16 | 11.4%
4 | 15 | 10.7%
7 | 15 | 10.7%
1 | 14 | 10.0%
8 | 6 | 4.3%
9 | 4 | 2.9%

Minimum 10 values

Value | Count | Frequency (%)
1 | 14 | 10.0%
2 | 27 | 19.3%
3 | 16 | 11.4%
4 | 15 | 10.7%
5 | 25 | 17.9%
6 | 18 | 12.9%
7 | 15 | 10.7%
8 | 6 | 4.3%
9 | 4 | 2.9%

Maximum 10 values

Value | Count | Frequency (%)
9 | 4 | 2.9%
8 | 6 | 4.3%
7 | 15 | 10.7%
6 | 18 | 12.9%
5 | 25 | 17.9%
4 | 15 | 10.7%
3 | 16 | 11.4%
2 | 27 | 19.3%
1 | 14 | 10.0%
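
Because 호선 has only nine distinct values, its frequency table is enough to recover the summary statistics. The following sketch copies the counts from the table and recomputes the sum, mean, median, and percentage shares; the variable names are illustrative only.

```python
import statistics

# Counts per 호선 value, copied from the frequency table.
line_counts = {1: 14, 2: 27, 3: 16, 4: 15, 5: 25, 6: 18, 7: 15, 8: 6, 9: 4}

n = sum(line_counts.values())                               # number of observations
total = sum(value * count for value, count in line_counts.items())
mean = total / n

# Expand the table back into one value per row to take the median.
values = [value for value, count in sorted(line_counts.items()) for _ in range(count)]
median = statistics.median(values)

# Frequency (%) column, rounded to one decimal place as in the report.
share = {value: round(100 * count / n, 1) for value, count in line_counts.items()}

print(n, total, round(mean, 7), median)
print(share)
```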

환승역명 (transfer station name)
Text

Distinct: 73
Distinct (%): 52.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Length

Max length: 9
Median length: 8
Mean length: 3.3071429
Min length: 2

Characters and Unicode

Total characters: 463
Distinct characters: 110
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
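
As an illustration of these character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using this variable's first five sample values (서울역, 서울역, 서울역, 시청, 종로3가). It covers only the category breakdown; scripts and blocks are not exposed by `unicodedata` and are left out.

```python
import unicodedata
from collections import Counter

# First five sample values of the variable.
samples = ["서울역", "서울역", "서울역", "시청", "종로3가"]

# Map the two-letter codes seen here to the long names used in the report.
CATEGORY_NAMES = {"Lo": "Other Letter", "Nd": "Decimal Number"}

category_counts = Counter(
    unicodedata.category(character) for name in samples for character in name
)

for code, count in category_counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```

Hangul syllables fall in the Other Letter (`Lo`) category and ASCII digits in Decimal Number (`Nd`), which is why the report lists exactly those two categories.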

Unique

Unique: 30
Unique (%): 21.4%

Sample

1st row: 서울역
2nd row: 서울역
3rd row: 서울역
4th row: 시청
5th row: 종로3가

Common values

Value | Count | Frequency (%)
종로3가 | 6 | 4.3%
왕십리 | 6 | 4.3%
공덕 | 6 | 4.3%
동대문역사문화공원 | 6 | 4.3%
신설동 | 4 | 2.9%
고속터미널 | 4 | 2.9%
서울역 | 3 | 2.1%
김포공항 | 3 | 2.1%
청량리 | 3 | 2.1%
서울 | 3 | 2.1%
Other values (63) | 96 | 68.6%

Most occurring characters

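Value | Count | Frequency (%)
(not recovered) | 19 | 4.1%
(not recovered) | 18 | 3.9%
(not recovered) | 18 | 3.9%
(not recovered) | 15 | 3.2%
(not recovered) | 14 | 3.0%
(not recovered) | 14 | 3.0%
(not recovered) | 14 | 3.0%
(not recovered) | 13 | 2.8%
(not recovered) | 11 | 2.4%
(not recovered) | 11 | 2.4%
Other values (100) | 316 | 68.3%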

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 453 | 97.8%
Decimal Number | 10 | 2.2%

Most frequent character per category

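Other Letter
Value | Count | Frequency (%)
(not recovered) | 19 | 4.2%
(not recovered) | 18 | 4.0%
(not recovered) | 18 | 4.0%
(not recovered) | 15 | 3.3%
(not recovered) | 14 | 3.1%
(not recovered) | 14 | 3.1%
(not recovered) | 14 | 3.1%
(not recovered) | 13 | 2.9%
(not recovered) | 11 | 2.4%
(not recovered) | 11 | 2.4%
Other values (98) | 306 | 67.5%

Decimal Number
Value | Count | Frequency (%)
3 | 8 | 80.0%
4 | 2 | 20.0%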

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 453 | 97.8%
Common | 10 | 2.2%

Most frequent character per script

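Hangul
Value | Count | Frequency (%)
(not recovered) | 19 | 4.2%
(not recovered) | 18 | 4.0%
(not recovered) | 18 | 4.0%
(not recovered) | 15 | 3.3%
(not recovered) | 14 | 3.1%
(not recovered) | 14 | 3.1%
(not recovered) | 14 | 3.1%
(not recovered) | 13 | 2.9%
(not recovered) | 11 | 2.4%
(not recovered) | 11 | 2.4%
Other values (98) | 306 | 67.5%

Common
Value | Count | Frequency (%)
3 | 8 | 80.0%
4 | 2 | 20.0%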

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 453 | 97.8%
ASCII | 10 | 2.2%

Most frequent character per block

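Hangul
Value | Count | Frequency (%)
(not recovered) | 19 | 4.2%
(not recovered) | 18 | 4.0%
(not recovered) | 18 | 4.0%
(not recovered) | 15 | 3.3%
(not recovered) | 14 | 3.1%
(not recovered) | 14 | 3.1%
(not recovered) | 14 | 3.1%
(not recovered) | 13 | 2.9%
(not recovered) | 11 | 2.4%
(not recovered) | 11 | 2.4%
Other values (98) | 306 | 67.5%

ASCII
Value | Count | Frequency (%)
3 | 8 | 80.0%
4 | 2 | 20.0%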

환승노선 (transfer line)
Categorical

Distinct: 19
Distinct (%): 13.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Value | Count
2호선 | 20
5호선 | 15
1호선 | 13
경의중앙선 | 13
3호선 | 11
Other values (14) | 68

Length

Max length: 6
Median length: 3
Mean length: 3.4642857
Min length: 2

Unique

Unique: 4
Unique (%): 2.9%

Sample

1st row: 4호선
2nd row: 공항철도
3rd row: 경의중앙선
4th row: 2호선
5th row: 3호선

Common Values

Value | Count | Frequency (%)
2호선 | 20 | 14.3%
5호선 | 15 | 10.7%
1호선 | 13 | 9.3%
경의중앙선 | 13 | 9.3%
3호선 | 11 | 7.9%
6호선 | 10 | 7.1%
4호선 | 9 | 6.4%
9호선 | 9 | 6.4%
수인분당선 | 9 | 6.4%
7호선 | 7 | 5.0%
Other values (9) | 24 | 17.1%

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
2호선 | 20 | 14.3%
5호선 | 15 | 10.7%
1호선 | 13 | 9.3%
경의중앙선 | 13 | 9.3%
3호선 | 11 | 7.9%
6호선 | 10 | 7.1%
4호선 | 9 | 6.4%
9호선 | 9 | 6.4%
수인분당선 | 9 | 6.4%
공항철도 | 7 | 5.0%
Other values (9) | 24 | 17.1%

환승거리(m) (transfer distance, in metres)
Real number (ℝ)

Distinct: 77
Distinct (%): 55.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 133.12143
Minimum: 7
Maximum: 355
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 KiB

Quantile statistics

Minimum: 7
5th percentile: 34.4
Q1: 77.75
Median: 107
Q3: 181
95th percentile: 279.35
Maximum: 355
Range: 348
Interquartile range (IQR): 103.25

Descriptive statistics

Standard deviation: 78.918177
Coefficient of variation (CV): 0.5928285
Kurtosis: -0.10541884
Mean: 133.12143
Median Absolute Deviation (MAD): 48
Skewness: 0.78442607
Sum: 18637
Variance: 6228.0787
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
110 | 5 | 3.6%
159 | 4 | 2.9%
77 | 4 | 2.9%
35 | 4 | 2.9%
155 | 4 | 2.9%
100 | 4 | 2.9%
82 | 4 | 2.9%
75 | 4 | 2.9%
93 | 3 | 2.1%
81 | 3 | 2.1%
Other values (67) | 101 | 72.1%

Minimum 10 values

Value | Count | Frequency (%)
7 | 1 | 0.7%
16 | 1 | 0.7%
17 | 3 | 2.1%
19 | 1 | 0.7%
23 | 1 | 0.7%
35 | 4 | 2.9%
41 | 1 | 0.7%
44 | 1 | 0.7%
45 | 2 | 1.4%
46 | 2 | 1.4%

Maximum 10 values

Value | Count | Frequency (%)
355 | 1 | 0.7%
323 | 1 | 0.7%
314 | 1 | 0.7%
312 | 2 | 1.4%
309 | 1 | 0.7%
305 | 1 | 0.7%
278 | 2 | 1.4%
276 | 2 | 1.4%
275 | 1 | 0.7%
273 | 1 | 0.7%
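
Several of the 환승거리(m) statistics are derived from others, so they can be cross-checked without the raw data. The sketch below copies the reported figures and recomputes the mean, range, IQR, coefficient of variation, and variance; the observation count of 140 is taken from the Overview.

```python
import math

# Reported figures for 환승거리(m), copied from the statistics for this variable.
n, total = 140, 18637
minimum, maximum = 7, 355
q1, q3 = 77.75, 181
std_dev = 78.918177

mean = total / n               # Mean = Sum / number of observations
value_range = maximum - minimum
iqr = q3 - q1                  # interquartile range
cv = std_dev / mean            # coefficient of variation
variance = std_dev ** 2

print(round(mean, 5), value_range, iqr, round(cv, 7), round(variance, 4))
```

Each derived value matches the corresponding reported figure to the precision shown in the report.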

환승소요시간(초) (transfer time, in seconds)
Text

Distinct: 70
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Length

Max length: 5
Median length: 5
Mean length: 4.7857143
Min length: 4

Characters and Unicode

Total characters: 670
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 32
Unique (%): 22.9%

Sample

1st row: 2분13초
2nd row: 4분18초
3rd row: 2분17초
4th row: 1분24초
5th row: 1분38초

Common values

Value | Count | Frequency (%)
1분8초 | 7 | 5.0%
1분23초 | 6 | 4.3%
1분32초 | 5 | 3.6%
1분18초 | 5 | 3.6%
2분13초 | 4 | 2.9%
1분4초 | 4 | 2.9%
0분29초 | 4 | 2.9%
2분9초 | 4 | 2.9%
1분38초 | 4 | 2.9%
1분3초 | 4 | 2.9%
Other values (60) | 93 | 66.4%

Most occurring characters

Value | Count | Frequency (%)
분 | 140 | 20.9%
초 | 140 | 20.9%
1 | 96 | 14.3%
2 | 86 | 12.8%
3 | 55 | 8.2%
4 | 42 | 6.3%
8 | 32 | 4.8%
0 | 31 | 4.6%
5 | 19 | 2.8%
9 | 14 | 2.1%
Other values (2) | 15 | 2.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 390 | 58.2%
Other Letter | 280 | 41.8%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 96 | 24.6%
2 | 86 | 22.1%
3 | 55 | 14.1%
4 | 42 | 10.8%
8 | 32 | 8.2%
0 | 31 | 7.9%
5 | 19 | 4.9%
9 | 14 | 3.6%
7 | 11 | 2.8%
6 | 4 | 1.0%

Other Letter
Value | Count | Frequency (%)
분 | 140 | 50.0%
초 | 140 | 50.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 390 | 58.2%
Hangul | 280 | 41.8%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 96 | 24.6%
2 | 86 | 22.1%
3 | 55 | 14.1%
4 | 42 | 10.8%
8 | 32 | 8.2%
0 | 31 | 7.9%
5 | 19 | 4.9%
9 | 14 | 3.6%
7 | 11 | 2.8%
6 | 4 | 1.0%

Hangul
Value | Count | Frequency (%)
분 | 140 | 50.0%
초 | 140 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 390 | 58.2%
Hangul | 280 | 41.8%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
분 | 140 | 50.0%
초 | 140 | 50.0%

ASCII
Value | Count | Frequency (%)
1 | 96 | 24.6%
2 | 86 | 22.1%
3 | 55 | 14.1%
4 | 42 | 10.8%
8 | 32 | 8.2%
0 | 31 | 7.9%
5 | 19 | 4.9%
9 | 14 | 3.6%
7 | 11 | 2.8%
6 | 4 | 1.0%

Interactions

[Figures: nine interaction plots; images not recovered]

Correlations

Variable | 연번 | 호선 | 환승역명 | 환승노선 | 환승거리(m) | 환승소요시간(초)
연번 | 1.000 | 0.953 | 0.854 | 0.000 | 0.195 | 0.000
호선 | 0.953 | 1.000 | 0.852 | 0.000 | 0.205 | 0.000
환승역명 | 0.854 | 0.852 | 1.000 | 0.900 | 0.897 | 0.991
환승노선 | 0.000 | 0.000 | 0.900 | 1.000 | 0.037 | 0.882
환승거리(m) | 0.195 | 0.205 | 0.897 | 0.037 | 1.000 | 1.000
환승소요시간(초) | 0.000 | 0.000 | 0.991 | 0.882 | 1.000 | 1.000

Variable | 연번 | 호선 | 환승거리(m) | 환승노선
연번 | 1.000 | 0.990 | -0.160 | 0.000
호선 | 0.990 | 1.000 | -0.165 | 0.000
환승거리(m) | -0.160 | -0.165 | 1.000 | 0.000
환승노선 | 0.000 | 0.000 | 0.000 | 1.000
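
The strong association between 연번 and 호선 follows from the row order: 연번 is strictly increasing and 호선 is monotonically increasing, so the rows are grouped by line. The sketch below rebuilds both columns from the 호선 value counts (14, 27, 16, 15, 25, 18, 15, 6, 4 for lines 1 to 9) and computes Pearson's r. The report does not say which coefficient each matrix uses, so this is an illustration of the effect rather than a reproduction of either matrix; `pearson` is a hypothetical helper.

```python
import math

# 호선 value counts from the report; rows are assumed sorted by 호선 (it is monotonic).
line_counts = {1: 14, 2: 27, 3: 16, 4: 15, 5: 25, 6: 18, 7: 15, 8: 6, 9: 4}
line = [value for value, count in sorted(line_counts.items()) for _ in range(count)]
serial = list(range(1, len(line) + 1))  # assumed 연번: 1..140


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


r = pearson(serial, line)
print(round(r, 3))
```

A coefficient this close to 1 mostly reflects the file's sort order, so the High correlation alert for these two columns carries little analytical information.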

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

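First rows

(index) | 연번 | 호선 | 환승역명 | 환승노선 | 환승거리(m) | 환승소요시간(초)
0 | 1 | 1 | 서울역 | 4호선 | 159 | 2분13초
1 | 2 | 1 | 서울역 | 공항철도 | 309 | 4분18초
2 | 3 | 1 | 서울역 | 경의중앙선 | 164 | 2분17초
3 | 4 | 1 | 시청 | 2호선 | 101 | 1분24초
4 | 5 | 1 | 종로3가 | 3호선 | 118 | 1분38초
5 | 6 | 1 | 종로3가 | 5호선 | 312 | 4분20초
6 | 7 | 1 | 동대문 | 4호선 | 194 | 2분42초
7 | 8 | 1 | 동묘앞 | 6호선 | 96 | 1분20초
8 | 9 | 1 | 신설동 | 2호선 | 159 | 2분13초
9 | 10 | 1 | 신설동 | 우이신설선 | 129 | 1분48초

Last rows

(index) | 연번 | 호선 | 환승역명 | 환승노선 | 환승거리(m) | 환승소요시간(초)
130 | 131 | 8 | 천호 | 5호선 | 35 | 0분29초
131 | 132 | 8 | 석촌 | 9호선 | 82 | 1분8초
132 | 133 | 8 | 잠실 | 2호선 | 190 | 2분38초
133 | 134 | 8 | 가락시장 | 3호선 | 35 | 0분29초
134 | 135 | 8 | 복정 | 수인분당선 | 16 | 0분13초
135 | 136 | 8 | 모란 | 수인분당선 | 99 | 1분23초
136 | 137 | 9 | 선정릉 | 수인분당선 | 110 | 1분32초
137 | 138 | 9 | 종합운동장 | 2호선 | 94 | 1분18초
138 | 139 | 9 | 석촌 | 8호선 | 82 | 1분8초
139 | 140 | 9 | 올림픽공원 | 5호선 | 93 | 1분18초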
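
The 환승소요시간(초) column is profiled as Text because its values combine minutes (분) and seconds (초), as in 2분13초. A minimal sketch of converting such strings to whole seconds follows; `parse_duration` is a hypothetical helper, and the four (distance, duration) pairs are copied from the sample rows.

```python
import re

# Matches values such as "2분13초": minutes, then 분, then seconds, then 초.
_DURATION = re.compile(r"(\d+)분(\d+)초")


def parse_duration(text: str) -> int:
    """Convert a string like '2분13초' to a number of seconds."""
    match = _DURATION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = map(int, match.groups())
    return 60 * minutes + seconds


# (환승거리(m), 환승소요시간(초)) pairs taken from the sample rows.
rows = [(159, "2분13초"), (309, "4분18초"), (35, "0분29초"), (16, "0분13초")]

# Implied walking speed in metres per second for each row.
speeds = [distance / parse_duration(duration) for distance, duration in rows]
print([round(speed, 2) for speed in speeds])
```

For these rows the implied speed stays close to 1.2 m/s, which fits the near-perfect association between distance and time shown in the correlation matrix. Parsing the column into integers before profiling would let it be treated as a numeric variable.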