Overview

Dataset statistics

Number of variables: 5
Number of observations: 285
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 12.4 KiB
Average record size in memory: 44.5 B
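
The average record size is the total in-memory size divided by the number of observations. The sketch below redoes that arithmetic from the figures above. It assumes 1 KiB = 1024 B and that the reported total is rounded to one decimal, so the result only agrees with 44.5 B to within rounding.

```python
# Figures copied from the dataset statistics above.
total_kib = 12.4   # total size in memory, already rounded in the report
n_rows = 285       # number of observations
n_cols = 5         # number of variables

total_bytes = total_kib * 1024            # assumption: 1 KiB = 1024 B
avg_record_bytes = total_bytes / n_rows   # bytes per row
n_cells = n_rows * n_cols                 # denominator behind "Missing cells (%)"

print(f"about {avg_record_bytes:.2f} B per record across {n_cells} cells")
```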

Variable types

Numeric: 4
Text: 1

Dataset

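Description: Average daily boarding ranks for Seoul Metro (서울교통공사) stations. The dataset provides the boarding rank, line, station number, station name, and average daily boardings. It was last updated as of December 2023.
Author: 서울교통공사 (Seoul Metro)
URL: https://www.data.go.kr/data/15044252/fileData.do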

Alerts

High correlation: 순위 (rank) is highly overall correlated with 일평균승차인원 (average daily boardings)
High correlation: 호선 (line) is highly overall correlated with 역코드 (station code)
High correlation: 역코드 is highly overall correlated with 호선
High correlation: 일평균승차인원 is highly overall correlated with 순위
Unique: 순위 has unique values
Unique: 역코드 has unique values
Unique: 역명 (station name) has unique values

Reproduction

Analysis started: 2024-05-04 08:08:42.153237
Analysis finished: 2024-05-04 08:08:48.711947
Duration: 6.56 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
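
The duration is the difference between the two timestamps. A minimal check, assuming both are in the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the reproduction details above.
started = datetime.strptime("2024-05-04 08:08:42.153237", FMT)
finished = datetime.strptime("2024-05-04 08:08:48.711947", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```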

Variables

순위
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 285
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 143
Minimum: 1
Maximum: 285
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 15.2
Q1: 72
Median: 143
Q3: 214
95-th percentile: 270.8
Maximum: 285
Range: 284
Interquartile range (IQR): 142
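
Because 순위 takes each value from 1 to 285 exactly once, these quantiles can be recomputed by hand. The non-integer percentiles (15.2 and 270.8) are consistent with linear interpolation between neighbouring order statistics. That the report uses this rule is an inference from the numbers, and the `percentile` helper below is a hypothetical illustration rather than the tool's own code.

```python
def percentile(sorted_values, q):
    """Percentile q (0..100) of pre-sorted data, using linear interpolation."""
    position = (len(sorted_values) - 1) * q / 100   # fractional index
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    step = sorted_values[upper] - sorted_values[lower]
    return sorted_values[lower] + step * fraction

ranks = list(range(1, 286))   # 순위: 1..285, each exactly once
summary = {q: percentile(ranks, q) for q in (5, 25, 50, 75, 95)}
iqr = summary[75] - summary[25]
print(summary, iqr)
```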

Descriptive statistics

Standard deviation: 82.416625
Coefficient of variation (CV): 0.57634003
Kurtosis: -1.2
Mean: 143
Median Absolute Deviation (MAD): 71
Skewness: 0
Sum: 40755
Variance: 6792.5
Monotonicity: Strictly increasing
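
Most of these figures follow directly from 순위 being the integers 1 to 285. The sketch below recomputes them with the Python standard library. The reported variance matches the sample variance (n - 1 denominator), not the population variance.

```python
import math
import statistics

ranks = list(range(1, 286))   # 순위: 1..285, each exactly once

total = sum(ranks)
mean = statistics.mean(ranks)
variance = statistics.variance(ranks)   # sample variance, divides by n - 1
std_dev = math.sqrt(variance)
cv = std_dev / mean                     # coefficient of variation

# Median absolute deviation: median distance from the median.
median = statistics.median(ranks)
mad = statistics.median(abs(r - median) for r in ranks)

print(total, mean, variance, round(std_dev, 6), round(cv, 8), mad)
```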
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
1                   1      0.4%
189                 1      0.4%
195                 1      0.4%
194                 1      0.4%
193                 1      0.4%
192                 1      0.4%
191                 1      0.4%
190                 1      0.4%
188                 1      0.4%
197                 1      0.4%
Other values (275)  275    96.5%

Smallest values (ascending)
Value  Count  Frequency (%)
1      1      0.4%
2      1      0.4%
3      1      0.4%
4      1      0.4%
5      1      0.4%
6      1      0.4%
7      1      0.4%
8      1      0.4%
9      1      0.4%
10     1      0.4%

Largest values (descending)
Value  Count  Frequency (%)
285    1      0.4%
284    1      0.4%
283    1      0.4%
282    1      0.4%
281    1      0.4%
280    1      0.4%
279    1      0.4%
278    1      0.4%
277    1      0.4%
276    1      0.4%

호선
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.8070175
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 3
Median: 5
Q3: 7
95-th percentile: 8
Maximum: 9
Range: 8
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.165987
Coefficient of variation (CV): 0.45058855
Kurtosis: -0.98900904
Mean: 4.8070175
Median Absolute Deviation (MAD): 2
Skewness: 0.064358114
Sum: 1370
Variance: 4.6914999
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Common values
Value  Count  Frequency (%)
5      56     19.6%
2      50     17.5%
7      42     14.7%
6      37     13.0%
3      33     11.6%
4      26     9.1%
8      18     6.3%
9      13     4.6%
1      10     3.5%

Smallest values (ascending)
Value  Count  Frequency (%)
1      10     3.5%
2      50     17.5%
3      33     11.6%
4      26     9.1%
5      56     19.6%
6      37     13.0%
7      42     14.7%
8      18     6.3%
9      13     4.6%

Largest values (descending)
Value  Count  Frequency (%)
9      13     4.6%
8      18     6.3%
7      42     14.7%
6      37     13.0%
5      56     19.6%
4      26     9.1%
3      33     11.6%
2      50     17.5%
1      10     3.5%
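
Since 호선 has only nine distinct values, its frequency table is enough to reproduce the sum, mean, variance, and percentage shares reported above. A minimal sketch using the counts as listed:

```python
# Station counts per line (호선), copied from the frequency table above.
counts = {5: 56, 2: 50, 7: 42, 6: 37, 3: 33, 4: 26, 8: 18, 9: 13, 1: 10}

n = sum(counts.values())                                   # observations
total = sum(line * c for line, c in counts.items())        # weighted sum
mean = total / n

# Sample variance (n - 1 denominator) from the grouped data.
squared_dev = sum(c * (line - mean) ** 2 for line, c in counts.items())
variance = squared_dev / (n - 1)

# Share of stations on each line, as a percentage rounded to one decimal.
shares = {line: round(100 * c / n, 1) for line, c in counts.items()}

print(n, total, round(mean, 7), round(variance, 7), shares[5])
```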

역코드
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 285
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1730.4456
Minimum: 150
Maximum: 4138
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 150
5-th percentile: 205.2
Q1: 320
Median: 2534
Q3: 2712
95-th percentile: 2826.8
Maximum: 4138
Range: 3988
Interquartile range (IQR): 2392

Descriptive statistics

Standard deviation: 1262.5495
Coefficient of variation (CV): 0.72960947
Kurtosis: -1.5156214
Mean: 1730.4456
Median Absolute Deviation (MAD): 284
Skewness: -0.10121826
Sum: 493177
Variance: 1594031.2
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
216                 1      0.4%
2719                1      0.4%
325                 1      0.4%
2712                1      0.4%
2746                1      0.4%
2821                1      0.4%
4133                1      0.4%
4127                1      0.4%
2551                1      0.4%
2742                1      0.4%
Other values (275)  275    96.5%

Smallest values (ascending)
Value  Count  Frequency (%)
150    1      0.4%
151    1      0.4%
152    1      0.4%
153    1      0.4%
154    1      0.4%
155    1      0.4%
156    1      0.4%
157    1      0.4%
158    1      0.4%
159    1      0.4%

Largest values (descending)
Value  Count  Frequency (%)
4138   1      0.4%
4137   1      0.4%
4136   1      0.4%
4135   1      0.4%
4134   1      0.4%
4133   1      0.4%
4132   1      0.4%
4131   1      0.4%
4130   1      0.4%
4129   1      0.4%

역명
Text

UNIQUE 

Distinct: 285
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 12
Median length: 11
Mean length: 3.877193
Min length: 2

Characters and Unicode

Total characters: 1105
Distinct characters: 223
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
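
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using the five sample station names listed in this report. The two-letter codes map onto the category names used here: `Lo` is Other Letter, `Nd` is Decimal Number, `Ps` is Open Punctuation, and `Pe` is Close Punctuation. The `category_counts` helper is a hypothetical name chosen for this example.

```python
import unicodedata
from collections import Counter

def category_counts(names):
    """Count Unicode general categories over every character in the names."""
    return Counter(unicodedata.category(ch) for name in names for ch in name)

# The five sample values of 역명 shown in this report.
sample_names = ["잠실(2)", "강남", "홍대입구", "구로디지털단지", "신림"]

counts = category_counts(sample_names)
mean_length = sum(len(name) for name in sample_names) / len(sample_names)

print(dict(counts), mean_length)
```

Hangul syllables fall under `Lo`, which is why Other Letter dominates the category table for this column.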

Unique

Unique: 285
Unique (%): 100.0%

Sample

1st row: 잠실(2)
2nd row: 강남
3rd row: 홍대입구
4th row: 구로디지털단지
5th row: 신림
Value               Count  Frequency (%)
잠실(2              1      0.4%
송파                1      0.4%
옥수                1      0.4%
도봉산(7            1      0.4%
대림(7              1      0.4%
복정(8              1      0.4%
석촌(9              1      0.4%
선정릉              1      0.4%
굽은다리            1      0.4%
뚝섬유원지          1      0.4%
Other values (275)  275    96.5%

Most occurring characters

Value               Count  Frequency (%)
(                   87     7.9%
)                   87     7.9%
(unreadable)        32     2.9%
(unreadable)        28     2.5%
(unreadable)        23     2.1%
(unreadable)        22     2.0%
(unreadable)        19     1.7%
5                   18     1.6%
(unreadable)        16     1.4%
2                   16     1.4%
Other values (213)  757    68.5%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       837    75.7%
Decimal Number     94     8.5%
Open Punctuation   87     7.9%
Close Punctuation  87     7.9%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(unreadable)        32     3.8%
(unreadable)        28     3.3%
(unreadable)        23     2.7%
(unreadable)        22     2.6%
(unreadable)        19     2.3%
(unreadable)        16     1.9%
(unreadable)        15     1.8%
(unreadable)        15     1.8%
(unreadable)        14     1.7%
(unreadable)        14     1.7%
Other values (202)  639    76.3%

Decimal Number
Value  Count  Frequency (%)
5      18     19.1%
2      16     17.0%
3      14     14.9%
6      11     11.7%
7      11     11.7%
4      9      9.6%
8      6      6.4%
1      6      6.4%
9      3      3.2%

Open Punctuation
Value  Count  Frequency (%)
(      87     100.0%

Close Punctuation
Value  Count  Frequency (%)
)      87     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  837    75.7%
Common  268    24.3%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
(unreadable)        32     3.8%
(unreadable)        28     3.3%
(unreadable)        23     2.7%
(unreadable)        22     2.6%
(unreadable)        19     2.3%
(unreadable)        16     1.9%
(unreadable)        15     1.8%
(unreadable)        15     1.8%
(unreadable)        14     1.7%
(unreadable)        14     1.7%
Other values (202)  639    76.3%

Common
Value  Count  Frequency (%)
(      87     32.5%
)      87     32.5%
5      18     6.7%
2      16     6.0%
3      14     5.2%
6      11     4.1%
7      11     4.1%
4      9      3.4%
8      6      2.2%
1      6      2.2%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  837    75.7%
ASCII   268    24.3%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
(      87     32.5%
)      87     32.5%
5      18     6.7%
2      16     6.0%
3      14     5.2%
6      11     4.1%
7      11     4.1%
4      9      3.4%
8      6      2.2%
1      6      2.2%

Hangul
Value               Count  Frequency (%)
(unreadable)        32     3.8%
(unreadable)        28     3.3%
(unreadable)        23     2.7%
(unreadable)        22     2.6%
(unreadable)        19     2.3%
(unreadable)        16     1.9%
(unreadable)        15     1.8%
(unreadable)        15     1.8%
(unreadable)        14     1.7%
(unreadable)        14     1.7%
Other values (202)  639    76.3%

일평균승차인원
Real number (ℝ)

HIGH CORRELATION 

Distinct: 284
Distinct (%): 99.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15228.825
Minimum: 1050
Maximum: 76010
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 1050
5-th percentile: 3436
Q1: 7530
Median: 12168
Q3: 18569
95-th percentile: 37634.2
Maximum: 76010
Range: 74960
Interquartile range (IQR): 11039

Descriptive statistics

Standard deviation: 11914.193
Coefficient of variation (CV): 0.78234488
Kurtosis: 6.1139591
Mean: 15228.825
Median Absolute Deviation (MAD): 5333
Skewness: 2.1372001
Sum: 4340215
Variance: 1.4194799 × 10^8
Monotonicity: Decreasing
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
9936                2      0.7%
76010               1      0.4%
9554                1      0.4%
8734                1      0.4%
8745                1      0.4%
8779                1      0.4%
8814                1      0.4%
8830                1      0.4%
8831                1      0.4%
8958                1      0.4%
Other values (274)  274    96.1%

Smallest values (ascending)
Value  Count  Frequency (%)
1050   1      0.4%
1213   1      0.4%
1359   1      0.4%
1594   1      0.4%
1833   1      0.4%
2084   1      0.4%
2367   1      0.4%
2406   1      0.4%
2477   1      0.4%
2647   1      0.4%

Largest values (descending)
Value  Count  Frequency (%)
76010  1      0.4%
74899  1      0.4%
67806  1      0.4%
53483  1      0.4%
53411  1      0.4%
51390  1      0.4%
51361  1      0.4%
50152  1      0.4%
48559  1      0.4%
47816  1      0.4%

Interactions

[16 interaction plots; images not reproduced here]

Correlations

                순위    호선    역코드  일평균승차인원
순위            1.000   0.428   0.458   0.848
호선            0.428   1.000   0.941   0.531
역코드          0.458   0.941   1.000   0.378
일평균승차인원  0.848   0.531   0.378   1.000

                순위    호선    역코드  일평균승차인원
순위            1.000   0.381   0.408   -1.000
호선            0.381   1.000   0.989   -0.381
역코드          0.408   0.989   1.000   -0.408
일평균승차인원  -1.000  -0.381  -0.408  1.000
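
The report text does not name the coefficient behind each matrix. The -1.000 between 순위 and 일평균승차인원 in the second matrix is what a rank-based coefficient such as Spearman's would give, because the ranking is defined by sorting boardings in descending order. The sketch below illustrates this on the first ten rows of the Sample section. The `ranks`, `pearson`, and `spearman` helpers are hypothetical, written for this example, and assume no tied values.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (spread_x * spread_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# First ten rows of the Sample section: 순위 and 일평균승차인원.
rank = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
boardings = [76010, 74899, 67806, 53483, 53411, 51390, 51361, 50152, 48559, 47816]

rho = spearman(rank, boardings)       # order is perfectly reversed
linear = pearson(rank, boardings)     # relationship is not perfectly linear
print(round(rho, 3), round(linear, 3))
```

On these ten rows the rank-based coefficient reaches -1 while the linear one does not, which shows how the same pair of columns can yield different magnitudes under different coefficients.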

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Index  순위  호선  역코드  역명            일평균승차인원
0      1     2     216     잠실(2)         76010
1      2     2     222     강남            74899
2      3     2     239     홍대입구        67806
3      4     2     232     구로디지털단지  53483
4      5     2     230     신림            53411
5      6     1     150     서울역(1)       51390
6      7     2     219     삼성            51361
7      8     2     220     선릉            50152
8      9     3     329     고속터미널(3)   48559
9      10    2     234     신도림          47816

Last rows
Index  순위  호선  역코드  역명      일평균승차인원
275    276   6     2614    독바위    2647
276    277   7     2711    장암      2477
277    278   3     336     학여울    2406
278    279   2     250     용두      2367
279    280   6     2633    버티고개  2084
280    281   4     431     동작      1833
281    282   2     245     신답      1594
282    283   4     434     남태령    1359
283    284   2     247     도림천    1213
284    285   9     4137    둔촌오륜  1050