Overview

Dataset statistics

Number of variables: 5
Number of observations: 285
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 12.4 KiB
Average record size in memory: 44.5 B

Variable types

Numeric: 4
Text: 1

Dataset

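Description: Data on annual ridership and ridership rank for 서울교통공사 (Seoul Metro). The dataset provides the rank (순위), station name (역명), and annual ridership in persons (연간수송인원(명)).
Author: 서울교통공사 (Seoul Metro)
URL: https://www.data.go.kr/data/15044243/fileData.do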

Alerts

순위 is highly overall correlated with 연간수송인원(명) (High correlation)
호선 is highly overall correlated with 역번호 (High correlation)
역번호 is highly overall correlated with 호선 (High correlation)
연간수송인원(명) is highly overall correlated with 순위 (High correlation)
순위 has unique values (Unique)
역번호 has unique values (Unique)
역명 has unique values (Unique)
연간수송인원(명) has unique values (Unique)

Reproduction

Analysis started: 2024-04-21 01:10:23.763644
Analysis finished: 2024-04-21 01:10:26.698475
Duration: 2.93 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
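
The reported duration follows from the two timestamps. A minimal sketch using only the Python standard library (the variable names are illustrative, not part of the report):

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2024-04-21 01:10:23.763644")
finished = datetime.fromisoformat("2024-04-21 01:10:26.698475")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```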

Variables

순위
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 285
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 143
Minimum: 1
Maximum: 285
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 15.2
Q1: 72
Median: 143
Q3: 214
95-th percentile: 270.8
Maximum: 285
Range: 284
Interquartile range (IQR): 142
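
The fractional percentiles (15.2 and 270.8) are consistent with linear interpolation between neighbouring order statistics, which is the pandas default. A sketch under that assumption, with a hypothetical helper `quantile_linear` and the ranks 1 to 285 taken as the data:

```python
def quantile_linear(sorted_values, q):
    """Quantile by linear interpolation between the two closest order statistics."""
    pos = q * (len(sorted_values) - 1)       # fractional 0-based position
    lo = int(pos)                            # index just below the position
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])

# Assumption: 순위 holds exactly the integers 1..285 (285 distinct values, min 1, max 285).
ranks = list(range(1, 286))

p05 = quantile_linear(ranks, 0.05)
q1 = quantile_linear(ranks, 0.25)
q3 = quantile_linear(ranks, 0.75)
p95 = quantile_linear(ranks, 0.95)
iqr = q3 - q1
print(round(p05, 1), q1, q3, round(p95, 1), iqr)
```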

Descriptive statistics

Standard deviation: 82.416625
Coefficient of variation (CV): 0.57634003
Kurtosis: -1.2
Mean: 143
Median Absolute Deviation (MAD): 71
Skewness: 0
Sum: 40755
Variance: 6792.5
Monotonicity: Strictly increasing
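
Because 순위 is a plain ranking, these figures can be reproduced from the integers 1 to n with n = 285. The sum is n(n+1)/2 and the sample variance (n − 1 in the denominator) is n(n+1)/12. A small check, assuming the column contains exactly those integers:

```python
import math
import statistics

# Assumption: 순위 holds exactly the integers 1..285.
ranks = list(range(1, 286))

total = sum(ranks)                          # n(n+1)/2
mean = statistics.mean(ranks)
variance = statistics.variance(ranks)       # sample variance, n(n+1)/12
std = math.sqrt(variance)
cv = std / mean                             # coefficient of variation
median = statistics.median(ranks)
mad = statistics.median(abs(r - median) for r in ranks)  # median absolute deviation

print(total, mean, variance, round(std, 6), round(cv, 8), mad)
```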
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
1                   1      0.4%
189                 1      0.4%
195                 1      0.4%
194                 1      0.4%
193                 1      0.4%
192                 1      0.4%
191                 1      0.4%
190                 1      0.4%
188                 1      0.4%
197                 1      0.4%
Other values (275)  275    96.5%

Value  Count  Frequency (%)
1      1      0.4%
2      1      0.4%
3      1      0.4%
4      1      0.4%
5      1      0.4%
6      1      0.4%
7      1      0.4%
8      1      0.4%
9      1      0.4%
10     1      0.4%

Value  Count  Frequency (%)
285    1      0.4%
284    1      0.4%
283    1      0.4%
282    1      0.4%
281    1      0.4%
280    1      0.4%
279    1      0.4%
278    1      0.4%
277    1      0.4%
276    1      0.4%

호선
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.8070175
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 3
Median: 5
Q3: 7
95-th percentile: 8
Maximum: 9
Range: 8
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.165987
Coefficient of variation (CV): 0.45058855
Kurtosis: -0.98900904
Mean: 4.8070175
Median Absolute Deviation (MAD): 2
Skewness: 0.064358114
Sum: 1370
Variance: 4.6914999
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Value  Count  Frequency (%)
5      56     19.6%
2      50     17.5%
7      42     14.7%
6      37     13.0%
3      33     11.6%
4      26     9.1%
8      18     6.3%
9      13     4.6%
1      10     3.5%

Value  Count  Frequency (%)
1      10     3.5%
2      50     17.5%
3      33     11.6%
4      26     9.1%
5      56     19.6%
6      37     13.0%
7      42     14.7%
8      18     6.3%
9      13     4.6%

Value  Count  Frequency (%)
9      13     4.6%
8      18     6.3%
7      42     14.7%
6      37     13.0%
5      56     19.6%
4      26     9.1%
3      33     11.6%
2      50     17.5%
1      10     3.5%
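
Since 호선 takes only nine distinct values, its sum, mean, and variance can be recomputed from the frequency table alone. A minimal sketch (the dictionary name `line_counts` is illustrative; the variance uses the sample n − 1 denominator, which matches the reported figure):

```python
import math

# Value -> count, copied from the 호선 frequency table.
line_counts = {5: 56, 2: 50, 7: 42, 6: 37, 3: 33, 4: 26, 8: 18, 9: 13, 1: 10}

n = sum(line_counts.values())                           # number of observations
total = sum(v * c for v, c in line_counts.items())      # weighted sum
mean = total / n
ss = sum(c * (v - mean) ** 2 for v, c in line_counts.items())
variance = ss / (n - 1)                                 # sample variance
std = math.sqrt(variance)

print(n, total, round(mean, 7), round(variance, 7), round(std, 6))
```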

역번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 285
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1730.4456
Minimum: 150
Maximum: 4138
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 150
5-th percentile: 205.2
Q1: 320
Median: 2534
Q3: 2712
95-th percentile: 2826.8
Maximum: 4138
Range: 3988
Interquartile range (IQR): 2392

Descriptive statistics

Standard deviation: 1262.5495
Coefficient of variation (CV): 0.72960947
Kurtosis: -1.5156214
Mean: 1730.4456
Median Absolute Deviation (MAD): 284
Skewness: -0.10121826
Sum: 493177
Variance: 1594031.2
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
239                 1      0.4%
2641                1      0.4%
2626                1      0.4%
2553                1      0.4%
325                 1      0.4%
2752                1      0.4%
2648                1      0.4%
2712                1      0.4%
2817                1      0.4%
243                 1      0.4%
Other values (275)  275    96.5%

Value  Count  Frequency (%)
150    1      0.4%
151    1      0.4%
152    1      0.4%
153    1      0.4%
154    1      0.4%
155    1      0.4%
156    1      0.4%
157    1      0.4%
158    1      0.4%
159    1      0.4%

Value  Count  Frequency (%)
4138   1      0.4%
4137   1      0.4%
4136   1      0.4%
4135   1      0.4%
4134   1      0.4%
4133   1      0.4%
4132   1      0.4%
4131   1      0.4%
4130   1      0.4%
4129   1      0.4%

역명
Text

UNIQUE 

Distinct: 285
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 12
Median length: 11
Mean length: 3.877193
Min length: 2

Characters and Unicode

Total characters: 1105
Distinct characters: 223
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 285
Unique (%): 100.0%

Sample

1st row: 홍대입구
2nd row: 잠실(2)
3rd row: 강남
4th row: 서울역(1)
5th row: 구로디지털단지
Value               Count  Frequency (%)
홍대입구            1      0.4%
동묘앞(6            1      0.4%
대흥                1      0.4%
고덕                1      0.4%
옥수                1      0.4%
온수(7              1      0.4%
봉화산              1      0.4%
도봉산(7            1      0.4%
송파                1      0.4%
거여                1      0.4%
Other values (275)  275    96.5%

Most occurring characters

ValueCountFrequency (%)
( 87
 
7.9%
) 87
 
7.9%
32
 
2.9%
28
 
2.5%
23
 
2.1%
22
 
2.0%
19
 
1.7%
5 18
 
1.6%
2 16
 
1.4%
16
 
1.4%
Other values (213) 757
68.5%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       837    75.7%
Decimal Number     94     8.5%
Open Punctuation   87     7.9%
Close Punctuation  87     7.9%
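
The category counts can be cross-checked against the length statistics: they should add up to the total character count, and dividing that total by the 285 rows should give the mean length. A short sketch with illustrative variable names:

```python
# Counts copied from the "Most occurring categories" table.
category_counts = {
    "Other Letter": 837,
    "Decimal Number": 94,
    "Open Punctuation": 87,
    "Close Punctuation": 87,
}
n_rows = 285  # number of observations from the Overview

total_chars = sum(category_counts.values())
mean_length = total_chars / n_rows
# Percentage share of each category, rounded to one decimal place.
shares = {name: round(100 * c / total_chars, 1) for name, c in category_counts.items()}

print(total_chars, round(mean_length, 6), shares)
```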

Most frequent character per category

Other Letter
ValueCountFrequency (%)
32
 
3.8%
28
 
3.3%
23
 
2.7%
22
 
2.6%
19
 
2.3%
16
 
1.9%
15
 
1.8%
15
 
1.8%
14
 
1.7%
14
 
1.7%
Other values (202) 639
76.3%
Decimal Number
Value  Count  Frequency (%)
5      18     19.1%
2      16     17.0%
3      14     14.9%
7      11     11.7%
6      11     11.7%
4      9      9.6%
1      6      6.4%
8      6      6.4%
9      3      3.2%
Open Punctuation
Value  Count  Frequency (%)
(      87     100.0%
Close Punctuation
Value  Count  Frequency (%)
)      87     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  837    75.7%
Common  268    24.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
32
 
3.8%
28
 
3.3%
23
 
2.7%
22
 
2.6%
19
 
2.3%
16
 
1.9%
15
 
1.8%
15
 
1.8%
14
 
1.7%
14
 
1.7%
Other values (202) 639
76.3%
Common
Value  Count  Frequency (%)
(      87     32.5%
)      87     32.5%
5      18     6.7%
2      16     6.0%
3      14     5.2%
7      11     4.1%
6      11     4.1%
4      9      3.4%
1      6      2.2%
8      6      2.2%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  837    75.7%
ASCII   268    24.3%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
(      87     32.5%
)      87     32.5%
5      18     6.7%
2      16     6.0%
3      14     5.2%
7      11     4.1%
6      11     4.1%
4      9      3.4%
1      6      2.2%
8      6      2.2%
Hangul
ValueCountFrequency (%)
32
 
3.8%
28
 
3.3%
23
 
2.7%
22
 
2.6%
19
 
2.3%
16
 
1.9%
15
 
1.8%
15
 
1.8%
14
 
1.7%
14
 
1.7%
Other values (202) 639
76.3%

연간수송인원(명)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 285
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8482665.7
Minimum: 628118
Maximum: 39441541
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 628118
5-th percentile: 1858798
Q1: 4222218
Median: 6832157
Q3: 10606584
95-th percentile: 22611032
Maximum: 39441541
Range: 38813423
Interquartile range (IQR): 6384366

Descriptive statistics

Standard deviation: 6531478.2
Coefficient of variation (CV): 0.76997944
Kurtosis: 4.7467445
Mean: 8482665.7
Median Absolute Deviation (MAD): 2976962
Skewness: 1.9399669
Sum: 2.4175597 × 10^9
Variance: 4.2660207 × 10^13
Monotonicity: Strictly decreasing
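
Several of these figures are derived from others in the report: the range and IQR from the quantiles, the CV from the standard deviation and mean, the variance from the standard deviation, and the sum from the mean and row count. A quick consistency check using the rounded values as printed (so the results agree only to the displayed precision):

```python
# Values copied from the 연간수송인원(명) statistics.
minimum, q1, q3, maximum = 628118, 4222218, 10606584, 39441541
mean, std = 8482665.7, 6531478.2
n = 285  # number of observations

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance
approx_sum = mean * n             # Sum, up to rounding of the mean

print(value_range, iqr, round(cv, 6), f"{variance:.4e}", f"{approx_sum:.4e}")
```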
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
39441541            1      0.4%
5064988             1      0.4%
4974809             1      0.4%
5010191             1      0.4%
5032924             1      0.4%
5042731             1      0.4%
5062633             1      0.4%
5064436             1      0.4%
5179956             1      0.4%
4872939             1      0.4%
Other values (275)  275    96.5%

Value    Count  Frequency (%)
628118   1      0.4%
702166   1      0.4%
713074   1      0.4%
781113   1      0.4%
1057939  1      0.4%
1106808  1      0.4%
1120565  1      0.4%
1244658  1      0.4%
1397469  1      0.4%
1406953  1      0.4%

Value     Count  Frequency (%)
39441541  1      0.4%
37873306  1      0.4%
37224977  1      0.4%
31086993  1      0.4%
28687464  1      0.4%
27296469  1      0.4%
27142292  1      0.4%
26726964  1      0.4%
26516158  1      0.4%
25489268  1      0.4%

Interactions

[16 interaction plots (images not included)]

Correlations

                  순위     호선     역번호   연간수송인원(명)
순위              1.000    0.387    0.447    0.866
호선              0.387    1.000    0.941    0.544
역번호            0.447    0.941    1.000    0.379
연간수송인원(명)  0.866    0.544    0.379    1.000

                  순위     호선     역번호   연간수송인원(명)
순위              1.000    0.358    0.384    -1.000
호선              0.358    1.000    0.989    -0.358
역번호            0.384    0.989    1.000    -0.384
연간수송인원(명)  -1.000   -0.358   -0.384   1.000
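
The text does not say which coefficient each matrix uses. The −1.000 between 순위 and 연간수송인원(명) in the second matrix is what a rank-based (Spearman) coefficient gives when one column strictly increases while the other strictly decreases, as the Monotonicity entries report. A standard-library sketch on the first ten sample rows, with hypothetical helper names and no tie handling (none is needed, since all values are unique):

```python
import math

def ranks_of(values):
    """Rank positions (1 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(sxx * syy)

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks_of(xs), ranks_of(ys))

# First ten rows of the Sample section: 순위 and 연간수송인원(명).
rank = list(range(1, 11))
ridership = [39441541, 37873306, 37224977, 31086993, 28687464,
             27296469, 27142292, 26726964, 26516158, 25489268]

rho = spearman(rank, ridership)
print(round(rho, 3))
```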

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

     순위  호선  역번호  역명            연간수송인원(명)
0    1     2     239     홍대입구      39441541
1    2     2     216     잠실(2)       37873306
2    3     2     222     강남          37224977
3    4     1     150     서울역(1)     31086993
4    5     2     232     구로디지털단지  28687464
5    6     2     230     신림          27296469
6    7     2     219     삼성          27142292
7    8     3     329     고속터미널(3)  26726964
8    9     2     234     신도림        26516158
9    10    2     221     역삼          25489268

     순위  호선  역번호  역명      연간수송인원(명)
275  276   3     336     학여울    1406953
276  277   2     244     용답      1397469
277  278   6     2633    버티고개  1244658
278  279   2     250     용두      1120565
279  280   7     2711    장암      1106808
280  281   4     431     동작      1057939
281  282   4     434     남태령    781113
282  283   2     247     도림천    713074
283  284   2     245     신답      702166
284  285   9     4137    둔촌오륜  628118