Overview

Dataset statistics

Number of variables: 5
Number of observations: 285
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 12.4 KiB
Average record size in memory: 44.5 B

Variable types

Numeric: 4
Text: 1

Dataset

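Description: Daily average alighting rank data from Seoul Metro (서울교통공사). The dataset contains the alighting rank, line, station number, station name, and number of alighting passengers (persons/day).
URL: https://www.data.go.kr/data/15044248/fileData.do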

Alerts

순위 is highly overall correlated with 일평균하차인원수 (High correlation)
호선 is highly overall correlated with 역번호 (High correlation)
역번호 is highly overall correlated with 호선 (High correlation)
일평균하차인원수 is highly overall correlated with 순위 (High correlation)
순위 has unique values (Unique)
역번호 has unique values (Unique)
역명 has unique values (Unique)
일평균하차인원수 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 16:19:57.615235
Analysis finished: 2023-12-12 16:19:59.488000
Duration: 1.87 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

순위
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 285
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 143
Minimum: 1
Maximum: 285
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 15.2
Q1: 72
Median: 143
Q3: 214
95-th percentile: 270.8
Maximum: 285
Range: 284
Interquartile range (IQR): 142
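The report does not say which interpolation rule produced these quantiles. The figures are consistent with linear interpolation between neighbouring order statistics, which is the default in pandas. The sketch below is an illustration under that assumption; `quantile_linear` is a hypothetical helper, not part of ydata-profiling.

```python
def quantile_linear(sorted_values, q):
    """Quantile by linear interpolation between the two nearest order statistics."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = q * (len(sorted_values) - 1)           # fractional 0-based position
    lower = int(pos)                             # floor, since pos >= 0
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


# 순위 takes every value from 1 to 285 exactly once (285 distinct values, min 1, max 285).
ranks = list(range(1, 286))

p5 = quantile_linear(ranks, 0.05)    # about 15.2
q1 = quantile_linear(ranks, 0.25)    # 72.0
q3 = quantile_linear(ranks, 0.75)    # 214.0
p95 = quantile_linear(ranks, 0.95)   # about 270.8
iqr = q3 - q1                        # 142.0
```

The 5-th percentile sits at position 0.05 × 284 = 14.2, one fifth of the way from the 15th to the 16th value, which is where the non-integer 15.2 comes from.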

Descriptive statistics

Standard deviation: 82.416625
Coefficient of variation (CV): 0.57634003
Kurtosis: -1.2
Mean: 143
Median Absolute Deviation (MAD): 71
Skewness: 0
Sum: 40755
Variance: 6792.5
Monotonicity: Strictly increasing
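Because 순위 is simply the integers 1 to 285, most of the figures above can be recomputed with the Python standard library. The sketch below assumes the variance uses the sample (n - 1) denominator. That assumption matches the reported 6792.5, whereas the population formula would give about 6768.67.

```python
import math
import statistics

# 순위 takes every value from 1 to 285 exactly once.
ranks = list(range(1, 286))

total = sum(ranks)                       # 40755
mean = statistics.mean(ranks)            # 143
variance = statistics.variance(ranks)    # sample variance (n - 1 denominator): 6792.5
std = math.sqrt(variance)                # about 82.416625
cv = std / mean                          # coefficient of variation, about 0.57634

# Median absolute deviation: median of |x - median(x)|.
median = statistics.median(ranks)
mad = statistics.median([abs(x - median) for x in ranks])   # 71
```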
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
1                   1      0.4%
189                 1      0.4%
195                 1      0.4%
194                 1      0.4%
193                 1      0.4%
192                 1      0.4%
191                 1      0.4%
190                 1      0.4%
188                 1      0.4%
197                 1      0.4%
Other values (275)  275    96.5%

Value               Count  Frequency (%)
1                   1      0.4%
2                   1      0.4%
3                   1      0.4%
4                   1      0.4%
5                   1      0.4%
6                   1      0.4%
7                   1      0.4%
8                   1      0.4%
9                   1      0.4%
10                  1      0.4%

Value               Count  Frequency (%)
285                 1      0.4%
284                 1      0.4%
283                 1      0.4%
282                 1      0.4%
281                 1      0.4%
280                 1      0.4%
279                 1      0.4%
278                 1      0.4%
277                 1      0.4%
276                 1      0.4%

호선
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.8070175
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 3
Median: 5
Q3: 7
95-th percentile: 8
Maximum: 9
Range: 8
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.165987
Coefficient of variation (CV): 0.45058855
Kurtosis: -0.98900904
Mean: 4.8070175
Median Absolute Deviation (MAD): 2
Skewness: 0.064358114
Sum: 1370
Variance: 4.6914999
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Value  Count  Frequency (%)
5      56     19.6%
2      50     17.5%
7      42     14.7%
6      37     13.0%
3      33     11.6%
4      26     9.1%
8      18     6.3%
9      13     4.6%
1      10     3.5%

Value  Count  Frequency (%)
1      10     3.5%
2      50     17.5%
3      33     11.6%
4      26     9.1%
5      56     19.6%
6      37     13.0%
7      42     14.7%
8      18     6.3%
9      13     4.6%

Value  Count  Frequency (%)
9      13     4.6%
8      18     6.3%
7      42     14.7%
6      37     13.0%
5      56     19.6%
4      26     9.1%
3      33     11.6%
2      50     17.5%
1      10     3.5%
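Since 호선 has only nine distinct values, its summary statistics can be rebuilt from the frequency table alone. The sketch below does that as a consistency check; the dictionary is copied from the counts above, and the sample (n - 1) denominator for the variance is an assumption that happens to match the reported value.

```python
import math

# Count of stations per line (호선), copied from the frequency table.
line_counts = {1: 10, 2: 50, 3: 33, 4: 26, 5: 56, 6: 37, 7: 42, 8: 18, 9: 13}

n = sum(line_counts.values())                                   # 285 observations
total = sum(value * count for value, count in line_counts.items())   # 1370
mean = total / n                                                # about 4.8070175

# Weighted sum of squared deviations, then the sample variance.
ss = sum(count * (value - mean) ** 2 for value, count in line_counts.items())
variance = ss / (n - 1)                                         # about 4.6914999
std = math.sqrt(variance)                                       # about 2.165987

# Share of each line in percent, rounded as in the table.
share = {value: round(100 * count / n, 1) for value, count in line_counts.items()}
```

For example, `share[5]` reproduces the 19.6% shown for line 5, the line with the most stations in this dataset.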

역번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 285
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1730.4456
Minimum: 150
Maximum: 4138
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 150
5-th percentile: 205.2
Q1: 320
Median: 2534
Q3: 2712
95-th percentile: 2826.8
Maximum: 4138
Range: 3988
Interquartile range (IQR): 2392

Descriptive statistics

Standard deviation: 1262.5495
Coefficient of variation (CV): 0.72960947
Kurtosis: -1.5156214
Mean: 1730.4456
Median Absolute Deviation (MAD): 284
Skewness: -0.10121826
Sum: 493177
Variance: 1594031.2
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
222                 1      0.4%
2742                1      0.4%
2637                1      0.4%
2719                1      0.4%
340                 1      0.4%
2746                1      0.4%
2626                1      0.4%
2644                1      0.4%
330                 1      0.4%
2636                1      0.4%
Other values (275)  275    96.5%

Value               Count  Frequency (%)
150                 1      0.4%
151                 1      0.4%
152                 1      0.4%
153                 1      0.4%
154                 1      0.4%
155                 1      0.4%
156                 1      0.4%
157                 1      0.4%
158                 1      0.4%
159                 1      0.4%

Value               Count  Frequency (%)
4138                1      0.4%
4137                1      0.4%
4136                1      0.4%
4135                1      0.4%
4134                1      0.4%
4133                1      0.4%
4132                1      0.4%
4131                1      0.4%
4130                1      0.4%
4129                1      0.4%

역명
Text

UNIQUE 

Distinct: 285
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 12
Median length: 11
Mean length: 3.877193
Min length: 2

Characters and Unicode

Total characters: 1105
Distinct characters: 223
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
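As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It uses only the five sample station names listed in this report, so the counts cover those five names rather than the full column. The `CATEGORY_NAMES` mapping is added here for readability and covers only the four categories the report mentions.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}

# The five sample values of 역명 shown in the report.
sample_names = ["강남", "잠실(2)", "홍대입구", "신림", "구로디지털단지"]

category_counts = Counter(
    unicodedata.category(ch)
    for name in sample_names
    for ch in unicodedata.normalize("NFC", name)   # count precomposed Hangul syllables
)

for code, count in category_counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```

Hangul syllables fall under Other Letter, which is why that category dominates, while the parenthesised line numbers such as `(2)` contribute the punctuation and decimal-number categories.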

Unique

Unique: 285
Unique (%): 100.0%

Sample

1st row: 강남
2nd row: 잠실(2)
3rd row: 홍대입구
4th row: 신림
5th row: 구로디지털단지
Value               Count  Frequency (%)
강남                1      0.4%
가락시장(8          1      0.4%
동묘앞(6            1      0.4%
태릉입구(7          1      0.4%
가락시장(3          1      0.4%
대림(7              1      0.4%
대흥                1      0.4%
돌곶이              1      0.4%
교대(3              1      0.4%
고덕                1      0.4%
Other values (275)  275    96.5%

Most occurring characters

ValueCountFrequency (%)
( 87
 
7.9%
) 87
 
7.9%
32
 
2.9%
28
 
2.5%
23
 
2.1%
22
 
2.0%
19
 
1.7%
5 18
 
1.6%
2 16
 
1.4%
16
 
1.4%
Other values (213) 757
68.5%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       837    75.7%
Decimal Number     94     8.5%
Open Punctuation   87     7.9%
Close Punctuation  87     7.9%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
32
 
3.8%
28
 
3.3%
23
 
2.7%
22
 
2.6%
19
 
2.3%
16
 
1.9%
15
 
1.8%
15
 
1.8%
14
 
1.7%
14
 
1.7%
Other values (202) 639
76.3%
Decimal Number
Value  Count  Frequency (%)
5      18     19.1%
2      16     17.0%
3      14     14.9%
7      11     11.7%
6      11     11.7%
4      9      9.6%
8      6      6.4%
1      6      6.4%
9      3      3.2%
Open Punctuation
Value  Count  Frequency (%)
(      87     100.0%
Close Punctuation
Value  Count  Frequency (%)
)      87     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  837    75.7%
Common  268    24.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
32
 
3.8%
28
 
3.3%
23
 
2.7%
22
 
2.6%
19
 
2.3%
16
 
1.9%
15
 
1.8%
15
 
1.8%
14
 
1.7%
14
 
1.7%
Other values (202) 639
76.3%
Common
Value  Count  Frequency (%)
(      87     32.5%
)      87     32.5%
5      18     6.7%
2      16     6.0%
3      14     5.2%
7      11     4.1%
6      11     4.1%
4      9      3.4%
8      6      2.2%
1      6      2.2%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  837    75.7%
ASCII   268    24.3%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
(      87     32.5%
)      87     32.5%
5      18     6.7%
2      16     6.0%
3      14     5.2%
7      11     4.1%
6      11     4.1%
4      9      3.4%
8      6      2.2%
1      6      2.2%
Hangul
ValueCountFrequency (%)
32
 
3.8%
28
 
3.3%
23
 
2.7%
22
 
2.6%
19
 
2.3%
16
 
1.9%
15
 
1.8%
15
 
1.8%
14
 
1.7%
14
 
1.7%
Other values (202) 639
76.3%

일평균하차인원수
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 285
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13699.674
Minimum: 323
Maximum: 70404
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 323
5-th percentile: 1704.4
Q1: 6240
Median: 10852
Q3: 17952
95-th percentile: 36127.8
Maximum: 70404
Range: 70081
Interquartile range (IQR): 11712

Descriptive statistics

Standard deviation: 11220.027
Coefficient of variation (CV): 0.81899961
Kurtosis: 5.4641887
Mean: 13699.674
Median Absolute Deviation (MAD): 5187
Skewness: 2.0110836
Sum: 3904407
Variance: 1.2588901 × 10^8
Monotonicity: Strictly decreasing
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
70404               1      0.4%
7967                1      0.4%
7669                1      0.4%
7762                1      0.4%
7817                1      0.4%
7858                1      0.4%
7912                1      0.4%
7938                1      0.4%
7974                1      0.4%
7632                1      0.4%
Other values (275)  275    96.5%

Value               Count  Frequency (%)
323                 1      0.4%
658                 1      0.4%
716                 1      0.4%
977                 1      0.4%
1016                1      0.4%
1164                1      0.4%
1257                1      0.4%
1274                1      0.4%
1365                1      0.4%
1370                1      0.4%

Value               Count  Frequency (%)
70404               1      0.4%
67651               1      0.4%
61255               1      0.4%
52910               1      0.4%
50452               1      0.4%
46905               1      0.4%
46331               1      0.4%
44051               1      0.4%
42521               1      0.4%
42042               1      0.4%

Interactions

[16 interaction plots (Matplotlib SVG images), not preserved in this text version]

Correlations

                  순위    호선    역번호  일평균하차인원수
순위              1.000   0.586   0.671   0.925
호선              0.586   1.000   0.941   0.429
역번호            0.671   0.941   1.000   0.477
일평균하차인원수  0.925   0.429   0.477   1.000
                  순위    호선    역번호  일평균하차인원수
순위              1.000   0.434   0.458   -1.000
호선              0.434   1.000   0.989   -0.434
역번호            0.458   0.989   1.000   -0.458
일평균하차인원수  -1.000  -0.434  -0.458  1.000
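The report does not label which coefficient each matrix holds. A value of exactly -1.000 between 순위 and 일평균하차인원수 is what a rank-based coefficient such as Spearman's gives when one variable is strictly increasing and the other strictly decreasing, as the monotonicity entries for these two variables state. The sketch below illustrates this with the first ten sample rows of the dataset; `ranks_of`, `pearson`, and `spearman` are hypothetical helpers written for this example, and the ranking assumes no ties (true here, since every value is unique).

```python
import math


def ranks_of(values):
    """Assign ranks 1..n by ascending value (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks_of(x), ranks_of(y))


# First ten rows of the sample: 순위 and 일평균하차인원수.
rank = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
alighting = [70404, 67651, 61255, 52910, 50452, 46905, 46331, 44051, 42521, 42042]

rho = spearman(rank, alighting)    # perfectly inverse ordering
r = pearson(rank, alighting)       # linear correlation is strong but not exact
```

On this subset `rho` is -1 while `r` is weaker in magnitude, because the passenger counts fall monotonically but not linearly with rank.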

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

     순위  호선  역번호  역명            일평균하차인원수
0    1     2     222     강남            70404
1    2     2     216     잠실(2)         67651
2    3     2     239     홍대입구        61255
3    4     2     230     신림            52910
4    5     2     232     구로디지털단지  50452
5    6     2     221     역삼            46905
6    7     2     219     삼성            46331
7    8     2     234     신도림          44051
8    9     3     329     고속터미널(3)   42521
9    10    2     228     서울대입구      42042

     순위  호선  역번호  역명            일평균하차인원수
275  276   9     4134    송파나루        1370
276  277   9     4136    올림픽공원(9)   1365
277  278   9     4132    석촌고분        1274
278  279   2     247     도림천          1257
279  280   9     4128    삼성중앙        1164
280  281   4     434     남태령          1016
281  282   7     2711    장암            977
282  283   9     4135    한성백제        716
283  284   9     4130    종합운동장(9)   658
284  285   9     4137    둔촌오륜        323