Dataset statistics
Number of variables | 7 |
---|---|
Number of observations | 10000 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 664.1 KiB |
Average record size in memory | 68.0 B |
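The average record size follows from the two figures above it: total memory divided by the number of observations. A minimal check, assuming that 1 KiB means 1024 bytes (the variable names are illustrative and not part of the report):

```python
# Figures copied from the "Dataset statistics" table.
total_size_kib = 664.1
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```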
Variable types
Numeric | 4 |
---|---|
Text | 1 |
Categorical | 2 |
Dataset
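Description | Seoul Metro (서울교통공사) boarding counts by ticket type: prepaid (선불), postpaid (후불), commuter pass (정기권), concession pass (우대권), single-journey ticket (1회권), and group ticket (단체권), broken down by month, station, and line. The data is updated through December 2022. |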
---|---|
URL | https://www.data.go.kr/data/15044254/fileData.do |
연번 is highly overall correlated with 수송연월 | High correlation |
호선 is highly overall correlated with 고유역번호(외부역코드) | High correlation |
고유역번호(외부역코드) is highly overall correlated with 호선 | High correlation |
수송연월 is highly overall correlated with 연번 | High correlation |
연번 has unique values | Unique |
Reproduction
Analysis started | 2023-12-12 07:04:57.649897 |
---|---|
Analysis finished | 2023-12-12 07:05:00.615852 |
Duration | 2.97 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
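A report like this one can be regenerated with ydata-profiling. The sketch below is only an outline under stated assumptions: the function name `build_report`, the file paths, the report title, and the CP949 encoding are all hypothetical and are not taken from this report; the downloaded `config.json` is not used here.

```python
from pathlib import Path


def build_report(csv_path: str, html_path: str = "report.html") -> Path:
    """Profile a CSV file and write an HTML report (sketch, see assumptions)."""
    # Imported lazily so this module still loads where the packages are missing.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the downloaded file is CP949-encoded; change this if it is UTF-8.
    df = pd.read_csv(csv_path, encoding="cp949")

    # Default settings; the title is an arbitrary placeholder.
    report = ProfileReport(df, title="Dataset profile")
    report.to_file(html_path)
    return Path(html_path)
```

Calling `build_report("data.csv")` with a local copy of the file would write `report.html` next to the script; the file name is a placeholder.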
연번
Real number (ℝ)
HIGH CORRELATION
  UNIQUE
 
Distinct | 10000 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 10321.404 |
Minimum | 3 |
---|---|
Maximum | 20520 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 3 |
---|---|
5-th percentile | 1070.9 |
Q1 | 5183.75 |
median | 10341.5 |
Q3 | 15454.25 |
95-th percentile | 19573.05 |
Maximum | 20520 |
Range | 20517 |
Interquartile range (IQR) | 10270.5 |
Descriptive statistics
Standard deviation | 5927.4076 |
---|---|
Coefficient of variation (CV) | 0.57428306 |
Kurtosis | -1.1996573 |
Mean | 10321.404 |
Median Absolute Deviation (MAD) | 5137 |
Skewness | -0.0099087209 |
Sum | 1.0321404 × 10^8 |
Variance | 35134161 |
Monotonicity | Not monotonic |
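Several of the 연번 statistics above are derived from others in the same tables. As a rough consistency check (a sketch that uses only the rounded values printed in this report, so the comparisons are approximate):

```python
# Values copied from the 연번 tables above.
minimum, maximum = 3, 20520
q1, q3 = 5183.75, 15454.25
mean, std = 10321.404, 5927.4076

value_range = maximum - minimum  # reported Range: 20517
iqr = q3 - q1                    # reported IQR: 10270.5
cv = std / mean                  # reported CV: 0.57428306
variance = std ** 2              # reported Variance: 35134161

print(value_range, iqr, round(cv, 6), round(variance))
```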
Value | Count | Frequency (%) |
18964 | 1 | < 0.1% |
1729 | 1 | < 0.1% |
249 | 1 | < 0.1% |
1544 | 1 | < 0.1% |
7634 | 1 | < 0.1% |
7346 | 1 | < 0.1% |
4623 | 1 | < 0.1% |
7024 | 1 | < 0.1% |
17858 | 1 | < 0.1% |
11061 | 1 | < 0.1% |
Other values (9990) | 9990 | 99.9% |
Value | Count | Frequency (%) |
3 | 1 | < 0.1% |
4 | 1 | < 0.1% |
5 | 1 | < 0.1% |
6 | 1 | < 0.1% |
7 | 1 | < 0.1% |
8 | 1 | < 0.1% |
13 | 1 | < 0.1% |
15 | 1 | < 0.1% |
17 | 1 | < 0.1% |
18 | 1 | < 0.1% |
Value | Count | Frequency (%) |
20520 | 1 | < 0.1% |
20518 | 1 | < 0.1% |
20517 | 1 | < 0.1% |
20514 | 1 | < 0.1% |
20513 | 1 | < 0.1% |
20510 | 1 | < 0.1% |
20508 | 1 | < 0.1% |
20507 | 1 | < 0.1% |
20506 | 1 | < 0.1% |
20505 | 1 | < 0.1% |
호선
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 9 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 4.804 |
Minimum | 1 |
---|---|
Maximum | 9 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 2 |
Q1 | 3 |
median | 5 |
Q3 | 7 |
95-th percentile | 8 |
Maximum | 9 |
Range | 8 |
Interquartile range (IQR) | 4 |
Descriptive statistics
Standard deviation | 2.1585759 |
---|---|
Coefficient of variation (CV) | 0.44932887 |
Kurtosis | -0.98593469 |
Mean | 4.804 |
Median Absolute Deviation (MAD) | 2 |
Skewness | 0.066218605 |
Sum | 48040 |
Variance | 4.6594499 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
5 | 1964 | 19.6% |
2 | 1726 | 17.3% |
7 | 1496 | 15.0% |
6 | 1297 | 13.0% |
3 | 1198 | 12.0% |
4 | 902 | 9.0% |
8 | 601 | 6.0% |
9 | 461 | 4.6% |
1 | 355 | 3.5% |
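Each Frequency (%) entry is the count divided by the total number of observations. A small sketch using the 호선 counts from the table above (the dictionary layout and variable names are illustrative):

```python
# Counts per 호선 (line number) value, copied from the table above.
counts = {5: 1964, 2: 1726, 7: 1496, 6: 1297, 3: 1198, 4: 902, 8: 601, 9: 461, 1: 355}
n_observations = sum(counts.values())

# Frequency in percent = 100 * count / total observations.
frequency_pct = {line: 100 * count / n_observations for line, count in counts.items()}

for line, pct in frequency_pct.items():
    print(f"{line} | {counts[line]} | {pct:.1f}%")
```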
Value | Count | Frequency (%) |
1 | 355 | 3.5% |
2 | 1726 | 17.3% |
3 | 1198 | 12.0% |
4 | 902 | 9.0% |
5 | 1964 | 19.6% |
6 | 1297 | 13.0% |
7 | 1496 | 15.0% |
8 | 601 | 6.0% |
9 | 461 | 4.6% |
Value | Count | Frequency (%) |
9 | 461 | 4.6% |
8 | 601 | 6.0% |
7 | 1496 | 15.0% |
6 | 1297 | 13.0% |
5 | 1964 | 19.6% |
4 | 902 | 9.0% |
3 | 1198 | 12.0% |
2 | 1726 | 17.3% |
1 | 355 | 3.5% |
고유역번호(외부역코드)
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 285 |
---|---|
Distinct (%) | 2.9% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1729.5503 |
Minimum | 150 |
---|---|
Maximum | 4138 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 150 |
---|---|
5-th percentile | 205 |
Q1 | 319.75 |
median | 2534 |
Q3 | 2712 |
95-th percentile | 2827 |
Maximum | 4138 |
Range | 3988 |
Interquartile range (IQR) | 2392.25 |
Descriptive statistics
Standard deviation | 1261.1245 |
---|---|
Coefficient of variation (CV) | 0.72916325 |
Kurtosis | -1.5083272 |
Mean | 1729.5503 |
Median Absolute Deviation (MAD) | 284 |
Skewness | -0.09647139 |
Sum | 17295503 |
Variance | 1590435.1 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
2647 | 47 | 0.5% |
2519 | 46 | 0.5% |
2542 | 45 | 0.4% |
2738 | 45 | 0.4% |
2642 | 44 | 0.4% |
2546 | 43 | 0.4% |
428 | 43 | 0.4% |
156 | 43 | 0.4% |
333 | 42 | 0.4% |
4135 | 42 | 0.4% |
Other values (275) | 9560 | 95.6% |
Value | Count | Frequency (%) |
150 | 32 | |
151 | 34 | |
152 | 35 | |
153 | 37 | |
154 | 38 | |
155 | 30 | |
156 | 43 | |
157 | 37 | |
158 | 30 | |
159 | 39 |
Value | Count | Frequency (%) |
4138 | 36 | |
4137 | 35 | |
4136 | 30 | |
4135 | 42 | |
4134 | 32 | |
4133 | 37 | |
4132 | 40 | |
4131 | 35 | |
4130 | 38 | |
4129 | 34 |
역명
Text
Distinct | 285 |
---|---|
Distinct (%) | 2.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Value | Count | Frequency (%) |
화랑대 | 47 | 0.5% |
까치산(5 | 46 | 0.5% |
마장 | 45 | 0.4% |
이수(7 | 45 | 0.4% |
월곡 | 44 | 0.4% |
신설동(1 | 43 | 0.4% |
아차산 | 43 | 0.4% |
삼각지(4 | 43 | 0.4% |
무악재 | 42 | 0.4% |
한성백제 | 42 | 0.4% |
Other values (275) | 9560 | 95.6% |
Most occurring characters
Value | Count | Frequency (%) |
( | 3016 | 7.8% |
) | 3016 | 7.8% |
대 | 1122 | 2.9% |
구 | 1003 | 2.6% |
동 | 826 | 2.1% |
신 | 753 | 1.9% |
산 | 693 | 1.8% |
5 | 651 | 1.7% |
원 | 571 | 1.5% |
2 | 541 | 1.4% |
Other values (213) | 26449 | 68.4% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 29356 | 76.0% |
Decimal Number | 3253 | 8.4% |
Open Punctuation | 3016 | 7.8% |
Close Punctuation | 3016 | 7.8% |
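The category names in this table are Unicode general categories: Other Letter (`Lo`), Decimal Number (`Nd`), Open Punctuation (`Ps`), and Close Punctuation (`Pe`). As an illustration (not the profiler's own implementation), Python's standard `unicodedata` module assigns the same categories; the helper name `category_counts` is hypothetical:

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category code (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)


# "군자(5)" is a station name that appears in the sample rows of this report.
counts = category_counts("군자(5)")
print(counts)
```

Hangul syllables fall under `Lo`, which is why station names contribute mostly to Other Letter, while line-number suffixes such as `(5)` add one character each to the other three categories.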
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
대 | 1122 | 3.8% |
구 | 1003 | 3.4% |
동 | 826 | 2.8% |
신 | 753 | 2.6% |
산 | 693 | 2.4% |
원 | 571 | 1.9% |
지 | 523 | 1.8% |
문 | 503 | 1.7% |
청 | 498 | 1.7% |
입 | 480 | 1.6% |
Other values (202) | 22384 | 76.3% |
Decimal Number
Value | Count | Frequency (%) |
5 | 651 | 20.0% |
2 | 541 | 16.6% |
3 | 480 | 14.8% |
7 | 385 | 11.8% |
6 | 367 | 11.3% |
4 | 316 | 9.7% |
1 | 215 | 6.6% |
8 | 193 | 5.9% |
9 | 105 | 3.2% |
Open Punctuation
Value | Count | Frequency (%) |
( | 3016 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 3016 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 29356 | 76.0% |
Common | 9285 | 24.0% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
대 | 1122 | 3.8% |
구 | 1003 | 3.4% |
동 | 826 | 2.8% |
신 | 753 | 2.6% |
산 | 693 | 2.4% |
원 | 571 | 1.9% |
지 | 523 | 1.8% |
문 | 503 | 1.7% |
청 | 498 | 1.7% |
입 | 480 | 1.6% |
Other values (202) | 22384 | 76.3% |
Common
Value | Count | Frequency (%) |
( | 3016 | 32.5% |
) | 3016 | 32.5% |
5 | 651 | 7.0% |
2 | 541 | 5.8% |
3 | 480 | 5.2% |
7 | 385 | 4.1% |
6 | 367 | 4.0% |
4 | 316 | 3.4% |
1 | 215 | 2.3% |
8 | 193 | 2.1% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 29356 | 76.0% |
ASCII | 9285 | 24.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
( | 3016 | 32.5% |
) | 3016 | 32.5% |
5 | 651 | 7.0% |
2 | 541 | 5.8% |
3 | 480 | 5.2% |
7 | 385 | 4.1% |
6 | 367 | 4.0% |
4 | 316 | 3.4% |
1 | 215 | 2.3% |
8 | 193 | 2.1% |
Hangul
Value | Count | Frequency (%) |
대 | 1122 | 3.8% |
구 | 1003 | 3.4% |
동 | 826 | 2.8% |
신 | 753 | 2.6% |
산 | 693 | 2.4% |
원 | 571 | 1.9% |
지 | 523 | 1.8% |
문 | 503 | 1.7% |
청 | 498 | 1.7% |
입 | 480 | 1.6% |
Other values (202) | 22384 | 76.3% |
수송연월
Categorical
HIGH CORRELATION
 
Distinct | 12 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 7 |
---|---|
Median length | 7 |
Mean length | 7 |
Min length | 7 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2022-12 |
---|---|
2nd row | 2022-02 |
3rd row | 2022-12 |
4th row | 2022-10 |
5th row | 2022-12 |
Common Values
Value | Count | Frequency (%) |
2022-09 | 863 | 8.6% |
2022-10 | 848 | 8.5% |
2022-12 | 843 | 8.4% |
2022-02 | 839 | 8.4% |
2022-11 | 838 | 8.4% |
2022-06 | 834 | 8.3% |
2022-03 | 830 | 8.3% |
2022-07 | 828 | 8.3% |
2022-05 | 827 | 8.3% |
2022-08 | 823 | 8.2% |
Other values (2) | 1627 | 16.3% |
권종
Categorical
Distinct | 6 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 3 |
---|---|
Median length | 3 |
Mean length | 2.6709 |
Min length | 2 |
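The mean length of 2.6709 is the count-weighted average of the string lengths of the six 권종 values: two of them have 2 characters and four have 3. A short check using the counts listed under Common Values (variable names are illustrative):

```python
# Counts per ticket type, copied from the "Common Values" table of 권종.
counts = {
    "정기권": 1694,
    "1회권": 1694,
    "우대권": 1665,
    "선불": 1662,
    "단체권": 1656,
    "후불": 1629,
}

total = sum(counts.values())
# Weighted mean of the character length of each value.
mean_length = sum(len(value) * count for value, count in counts.items()) / total
print(mean_length)
```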
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 선불 |
---|---|
2nd row | 후불 |
3rd row | 선불 |
4th row | 후불 |
5th row | 선불 |
Common Values
Value | Count | Frequency (%) |
정기권 | 1694 | |
1회권 | 1694 | |
우대권 | 1665 | |
선불 | 1662 | |
단체권 | 1656 | |
후불 | 1629 |
승차인원수
Real number (ℝ)
Distinct | 8585 |
---|---|
Distinct (%) | 85.9% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 70171.198 |
Minimum | 0 |
---|---|
Maximum | 1611119 |
Zeros | 73 |
Zeros (%) | 0.7% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 699.95 |
Q1 | 2425.25 |
median | 16639.5 |
Q3 | 81729.75 |
95-th percentile | 299199.4 |
Maximum | 1611119 |
Range | 1611119 |
Interquartile range (IQR) | 79304.5 |
Descriptive statistics
Standard deviation | 131932.13 |
---|---|
Coefficient of variation (CV) | 1.8801464 |
Kurtosis | 28.071962 |
Mean | 70171.198 |
Median Absolute Deviation (MAD) | 15829.5 |
Skewness | 4.3544723 |
Sum | 7.0171198 × 10^8 |
Variance | 1.7406087 × 10^10 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0 | 73 | 0.7% |
1088 | 6 | 0.1% |
1722 | 5 | 0.1% |
757 | 5 | 0.1% |
1362 | 5 | 0.1% |
2744 | 5 | 0.1% |
659 | 5 | 0.1% |
1265 | 5 | 0.1% |
1774 | 5 | 0.1% |
1537 | 5 | 0.1% |
Other values (8575) | 9881 |
Value | Count | Frequency (%) |
0 | 73 | |
87 | 1 | < 0.1% |
90 | 1 | < 0.1% |
103 | 1 | < 0.1% |
108 | 1 | < 0.1% |
116 | 1 | < 0.1% |
127 | 2 | < 0.1% |
129 | 1 | < 0.1% |
132 | 1 | < 0.1% |
133 | 2 | < 0.1% |
Value | Count | Frequency (%) |
1611119 | 1 | |
1610601 | 1 | |
1524155 | 1 | |
1520522 | 1 | |
1508910 | 1 | |
1473243 | 1 | |
1448127 | 1 | |
1442599 | 1 | |
1432465 | 1 | |
1427873 | 1 |
Variable | 연번 | 호선 | 고유역번호(외부역코드) | 수송연월 | 권종 | 승차인원수 |
---|---|---|---|---|---|---|
연번 | 1.000 | 0.010 | 0.009 | 0.959 | 0.237 | 0.041 |
호선 | 0.010 | 1.000 | 0.942 | 0.000 | 0.000 | 0.186 |
고유역번호(외부역코드) | 0.009 | 0.942 | 1.000 | 0.000 | 0.000 | 0.176 |
수송연월 | 0.959 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 |
권종 | 0.237 | 0.000 | 0.000 | 0.000 | 1.000 | 0.500 |
승차인원수 | 0.041 | 0.186 | 0.176 | 0.000 | 0.500 | 1.000 |
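The high-correlation alerts near the top of the report can be related to this matrix by listing the variable pairs whose coefficient exceeds a cutoff. The sketch below assumes a cutoff of 0.5 with a strict comparison; that value and the helper name `high_pairs` are assumptions, not settings read from this report. The printed coefficients are rounded to three decimals, so the 권종 / 승차인원수 pair at 0.500 sits exactly on the assumed boundary.

```python
# Values copied from the correlation table above (rounded to 3 decimals).
names = ["연번", "호선", "고유역번호(외부역코드)", "수송연월", "권종", "승차인원수"]
matrix = [
    [1.000, 0.010, 0.009, 0.959, 0.237, 0.041],
    [0.010, 1.000, 0.942, 0.000, 0.000, 0.186],
    [0.009, 0.942, 1.000, 0.000, 0.000, 0.176],
    [0.959, 0.000, 0.000, 1.000, 0.000, 0.000],
    [0.237, 0.000, 0.000, 0.000, 1.000, 0.500],
    [0.041, 0.186, 0.176, 0.000, 0.500, 1.000],
]


def high_pairs(names, matrix, threshold=0.5):
    """Return (a, b, value) for each pair above the threshold, upper triangle only."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(matrix[i][j]) > threshold:
                pairs.append((names[i], names[j], matrix[i][j]))
    return pairs


for a, b, value in high_pairs(names, matrix):
    print(f"{a} ~ {b}: {value:.3f}")
```

Under these assumptions the two pairs returned are 연번 with 수송연월 and 호선 with 고유역번호(외부역코드), the same pairs named in the alerts.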
Variable | 권종 | 수송연월 |
---|---|---|
권종 | 1.000 | 0.000 |
수송연월 | 0.000 | 1.000 |
Variable | 연번 | 호선 | 고유역번호(외부역코드) | 승차인원수 | 수송연월 | 권종 |
---|---|---|---|---|---|---|
연번 | 1.000 | 0.008 | 0.008 | -0.022 | 0.839 | 0.126 |
호선 | 0.008 | 1.000 | 0.989 | -0.139 | 0.000 | 0.000 |
고유역번호(외부역코드) | 0.008 | 0.989 | 1.000 | -0.148 | 0.000 | 0.000 |
승차인원수 | -0.022 | -0.139 | -0.148 | 1.000 | 0.000 | 0.291 |
수송연월 | 0.839 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 |
권종 | 0.126 | 0.000 | 0.000 | 0.291 | 0.000 | 1.000 |
Index | 연번 | 호선 | 고유역번호(외부역코드) | 역명 | 수송연월 | 권종 | 승차인원수 |
---|---|---|---|---|---|---|---|
18963 | 18964 | 5 | 2545 | 군자(5) | 2022-12 | 선불 | 74133 |
2049 | 2050 | 2 | 245 | 신답 | 2022-02 | 후불 | 20657 |
18956 | 18957 | 5 | 2538 | 청구(5) | 2022-12 | 선불 | 22215 |
15782 | 15783 | 4 | 423 | 충무로(4) | 2022-10 | 후불 | 459185 |
18826 | 18827 | 2 | 207 | 상왕십리 | 2022-12 | 선불 | 89956 |
8764 | 8765 | 7 | 2713 | 수락산 | 2022-06 | 선불 | 58974 |
19420 | 19421 | 2 | 231 | 신대방 | 2022-12 | 정기권 | 30238 |
6052 | 6053 | 3 | 316 | 독립문 | 2022-04 | 우대권 | 63130 |
20415 | 20416 | 6 | 2617 | 새절 | 2022-12 | 단체권 | 1835 |
15261 | 15262 | 5 | 2548 | 천호(5) | 2022-09 | 단체권 | 1947 |
Index | 연번 | 호선 | 고유역번호(외부역코드) | 역명 | 수송연월 | 권종 | 승차인원수 |
---|---|---|---|---|---|---|---|
3318 | 3319 | 6 | 2620 | 월드컵경기장 | 2022-02 | 단체권 | 2957 |
7086 | 7087 | 7 | 2745 | 신풍 | 2022-05 | 선불 | 70811 |
12754 | 12755 | 7 | 2713 | 수락산 | 2022-08 | 정기권 | 14963 |
8946 | 8947 | 4 | 427 | 숙대입구 | 2022-06 | 후불 | 241695 |
8589 | 8590 | 2 | 230 | 신림 | 2022-06 | 선불 | 356010 |
6313 | 6314 | 2 | 234 | 신도림 | 2022-04 | 1회권 | 4004 |
591 | 592 | 2 | 212 | 건대입구(2) | 2022-01 | 정기권 | 22827 |
9027 | 9028 | 6 | 2629 | 삼각지(6) | 2022-06 | 후불 | 145117 |
18811 | 18812 | 1 | 151 | 시청(1) | 2022-12 | 선불 | 193420 |
12832 | 12833 | 1 | 157 | 제기동 | 2022-08 | 우대권 | 261885 |