Dataset statistics
Number of variables | 6 |
---|---|
Number of observations | 3420 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 173.8 KiB |
Average record size in memory | 52.0 B |
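The two memory figures above agree with each other: the total size divided by the number of observations gives the average record size. A minimal check, assuming that KiB means 1024 bytes:

```python
# Values copied from the dataset statistics table.
total_kib = 173.8
n_rows = 3420

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_kib * 1024 / n_rows
print(f"{avg_record_bytes:.1f} B")  # 52.0 B
```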
Variable types
Numeric | 4 |
---|---|
Text | 1 |
Categorical | 1 |
Dataset
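Description | Monthly ridership data from Seoul Metro (서울교통공사). The data consists of a serial number, line, station number, and monthly ridership. It is published as annual data, with files uploaded through December 2022. |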
---|---|
URL | https://www.data.go.kr/data/15044253/fileData.do |
연번 is highly overall correlated with 수송연월 | High correlation |
호선 is highly overall correlated with 고유역번호(외부역코드) | High correlation |
고유역번호(외부역코드) is highly overall correlated with 호선 | High correlation |
수송연월 is highly overall correlated with 연번 | High correlation |
연번 has unique values | Unique |
Reproduction
Analysis started | 2023-12-12 06:00:34.651547 |
---|---|
Analysis finished | 2023-12-12 06:00:37.484431 |
Duration | 2.83 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
연번
Real number (ℝ)
HIGH CORRELATION, UNIQUE
Distinct | 3420 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1710.5 |
Minimum | 1 |
---|---|
Maximum | 3420 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 30.2 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 171.95 |
Q1 | 855.75 |
median | 1710.5 |
Q3 | 2565.25 |
95-th percentile | 3249.05 |
Maximum | 3420 |
Range | 3419 |
Interquartile range (IQR) | 1709.5 |
Descriptive statistics
Standard deviation | 987.41329 |
---|---|
Coefficient of variation (CV) | 0.57726588 |
Kurtosis | -1.2 |
Mean | 1710.5 |
Median Absolute Deviation (MAD) | 855 |
Skewness | 0 |
Sum | 5849910 |
Variance | 974985 |
Monotonicity | Strictly increasing |
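Because 연번 is unique, strictly increasing, and spans 1 to 3420, its statistics are those of the plain integer sequence 1..n. The sketch below reproduces them with the standard library; it assumes that the report uses the sample variance (n - 1 denominator) and linearly interpolated quantiles, which is what the figures above are consistent with.

```python
import statistics

n = 3420
serial = list(range(1, n + 1))  # 연번 runs 1..3420 with no gaps

total = sum(serial)                      # n(n+1)/2
mean = statistics.mean(serial)           # (n+1)/2
variance = statistics.variance(serial)   # sample variance, equals n(n+1)/12
q1, median, q3 = statistics.quantiles(serial, n=4, method="inclusive")

print(total, mean, variance)   # 5849910 1710.5 974985
print(q1, median, q3)          # 855.75 1710.5 2565.25
```

The closed forms n(n+1)/2 and n(n+1)/12 give the same sum and variance without building the list.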
Value | Count | Frequency (%) |
1 | 1 | < 0.1% |
2274 | 1 | < 0.1% |
2276 | 1 | < 0.1% |
2277 | 1 | < 0.1% |
2278 | 1 | < 0.1% |
2279 | 1 | < 0.1% |
2280 | 1 | < 0.1% |
2281 | 1 | < 0.1% |
2282 | 1 | < 0.1% |
2283 | 1 | < 0.1% |
Other values (3410) | 3410 | 99.7% |
Value | Count | Frequency (%) |
1 | 1 | < 0.1% |
2 | 1 | < 0.1% |
3 | 1 | < 0.1% |
4 | 1 | < 0.1% |
5 | 1 | < 0.1% |
6 | 1 | < 0.1% |
7 | 1 | < 0.1% |
8 | 1 | < 0.1% |
9 | 1 | < 0.1% |
10 | 1 | < 0.1% |
Value | Count | Frequency (%) |
3420 | 1 | < 0.1% |
3419 | 1 | < 0.1% |
3418 | 1 | < 0.1% |
3417 | 1 | < 0.1% |
3416 | 1 | < 0.1% |
3415 | 1 | < 0.1% |
3414 | 1 | < 0.1% |
3413 | 1 | < 0.1% |
3412 | 1 | < 0.1% |
3411 | 1 | < 0.1% |
호선
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 9 |
---|---|
Distinct (%) | 0.3% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 4.8070175 |
Minimum | 1 |
---|---|
Maximum | 9 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 30.2 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 2 |
Q1 | 3 |
median | 5 |
Q3 | 7 |
95-th percentile | 8 |
Maximum | 9 |
Range | 8 |
Interquartile range (IQR) | 4 |
Descriptive statistics
Standard deviation | 2.1624999 |
---|---|
Coefficient of variation (CV) | 0.44986312 |
Kurtosis | -0.99241848 |
Mean | 4.8070175 |
Median Absolute Deviation (MAD) | 2 |
Skewness | 0.064046985 |
Sum | 16440 |
Variance | 4.6764058 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
5 | 672 | 19.6% |
2 | 600 | 17.5% |
7 | 504 | 14.7% |
6 | 444 | 13.0% |
3 | 396 | 11.6% |
4 | 312 | 9.1% |
8 | 216 | 6.3% |
9 | 156 | 4.6% |
1 | 120 | 3.5% |
Value | Count | Frequency (%) |
1 | 120 | 3.5% |
2 | 600 | 17.5% |
3 | 396 | 11.6% |
4 | 312 | 9.1% |
5 | 672 | 19.6% |
6 | 444 | 13.0% |
7 | 504 | 14.7% |
8 | 216 | 6.3% |
9 | 156 | 4.6% |
Value | Count | Frequency (%) |
9 | 156 | 4.6% |
8 | 216 | 6.3% |
7 | 504 | 14.7% |
6 | 444 | 13.0% |
5 | 672 | 19.6% |
4 | 312 | 9.1% |
3 | 396 | 11.6% |
2 | 600 | 17.5% |
1 | 120 | 3.5% |
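The row counts per line can be turned into shares of the dataset and into station counts. The sketch below assumes that every station code contributes exactly one row per month of 2022 (12 rows), which matches the count of 12 shown for each 고유역번호(외부역코드) value; the variable names are illustrative.

```python
# Row counts per line (호선), copied from the frequency table.
rows_per_line = {1: 120, 2: 600, 3: 396, 4: 312, 5: 672,
                 6: 444, 7: 504, 8: 216, 9: 156}
total_rows = sum(rows_per_line.values())

MONTHS = 12  # assumption: one row per station code per month of 2022

for line, rows in sorted(rows_per_line.items()):
    share = 100 * rows / total_rows
    stations = rows // MONTHS
    print(f"line {line}: {rows} rows, {share:.1f}%, {stations} station codes")

stations_total = sum(rows // MONTHS for rows in rows_per_line.values())
print(total_rows, stations_total)  # 3420 285
```

The station total of 285 equals the number of distinct station codes reported for 고유역번호(외부역코드).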
고유역번호(외부역코드)
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 285 |
---|---|
Distinct (%) | 8.3% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1730.4456 |
Minimum | 150 |
---|---|
Maximum | 4138 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 30.2 KiB |
Quantile statistics
Minimum | 150 |
---|---|
5-th percentile | 205 |
Q1 | 320 |
median | 2534 |
Q3 | 2712 |
95-th percentile | 2827 |
Maximum | 4138 |
Range | 3988 |
Interquartile range (IQR) | 2392 |
Descriptive statistics
Standard deviation | 1260.5169 |
---|---|
Coefficient of variation (CV) | 0.72843483 |
Kurtosis | -1.5105945 |
Mean | 1730.4456 |
Median Absolute Deviation (MAD) | 284 |
Skewness | -0.10072894 |
Sum | 5918124 |
Variance | 1588902.7 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
150 | 12 | 0.4% |
2625 | 12 | 0.4% |
2631 | 12 | 0.4% |
2630 | 12 | 0.4% |
2629 | 12 | 0.4% |
2628 | 12 | 0.4% |
2627 | 12 | 0.4% |
2626 | 12 | 0.4% |
2624 | 12 | 0.4% |
2633 | 12 | 0.4% |
Other values (275) | 3300 | 96.5% |
Value | Count | Frequency (%) |
150 | 12 | 0.4% |
151 | 12 | 0.4% |
152 | 12 | 0.4% |
153 | 12 | 0.4% |
154 | 12 | 0.4% |
155 | 12 | 0.4% |
156 | 12 | 0.4% |
157 | 12 | 0.4% |
158 | 12 | 0.4% |
159 | 12 | 0.4% |
Value | Count | Frequency (%) |
4138 | 12 | 0.4% |
4137 | 12 | 0.4% |
4136 | 12 | 0.4% |
4135 | 12 | 0.4% |
4134 | 12 | 0.4% |
4133 | 12 | 0.4% |
4132 | 12 | 0.4% |
4131 | 12 | 0.4% |
4130 | 12 | 0.4% |
4129 | 12 | 0.4% |
역명
Text
Distinct | 285 |
---|---|
Distinct (%) | 8.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 26.8 KiB |
Value | Count | Frequency (%) |
서울역(1 | 12 | 0.4% |
약수(6 | 12 | 0.4% |
이태원 | 12 | 0.4% |
녹사평 | 12 | 0.4% |
삼각지(6 | 12 | 0.4% |
효창공원앞 | 12 | 0.4% |
공덕(6 | 12 | 0.4% |
대흥 | 12 | 0.4% |
상수 | 12 | 0.4% |
구산 | 12 | 0.4% |
Other values (275) | 3300 | 96.5% |
Most occurring characters
Value | Count | Frequency (%) |
( | 1044 | 7.9% |
) | 1044 | 7.9% |
대 | 384 | 2.9% |
구 | 336 | 2.5% |
동 | 276 | 2.1% |
신 | 264 | 2.0% |
산 | 228 | 1.7% |
5 | 216 | 1.6% |
원 | 192 | 1.4% |
2 | 192 | 1.4% |
Other values (213) | 9084 | 68.5% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 10044 | 75.7% |
Decimal Number | 1128 | 8.5% |
Open Punctuation | 1044 | 7.9% |
Close Punctuation | 1044 | 7.9% |
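The category names in this table correspond to Unicode general categories (Lo, Nd, Ps, Pe). A minimal sketch of how such counts can be produced with Python's `unicodedata` module follows; the helper `category_counts` and the three sample names are illustrative, and this is not necessarily how the profiling tool computes them.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}

def category_counts(names):
    """Count characters per Unicode general category across all names."""
    counts = Counter()
    for name in names:
        for ch in name:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Three station names taken from the first rows of the dataset.
sample = ["서울역(1)", "시청(1)", "종각"]
counts = category_counts(sample)
print(counts)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the station names.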
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
대 | 384 | 3.8% |
구 | 336 | 3.3% |
동 | 276 | 2.7% |
신 | 264 | 2.6% |
산 | 228 | 2.3% |
원 | 192 | 1.9% |
지 | 180 | 1.8% |
문 | 180 | 1.8% |
로 | 168 | 1.7% |
입 | 168 | 1.7% |
Other values (202) | 7668 | 76.3% |
Decimal Number
Value | Count | Frequency (%) |
5 | 216 | 19.1% |
2 | 192 | 17.0% |
3 | 168 | 14.9% |
6 | 132 | 11.7% |
7 | 132 | 11.7% |
4 | 108 | 9.6% |
8 | 72 | 6.4% |
1 | 72 | 6.4% |
9 | 36 | 3.2% |
Open Punctuation
Value | Count | Frequency (%) |
( | 1044 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 1044 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 10044 | 75.7% |
Common | 3216 | 24.3% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
대 | 384 | 3.8% |
구 | 336 | 3.3% |
동 | 276 | 2.7% |
신 | 264 | 2.6% |
산 | 228 | 2.3% |
원 | 192 | 1.9% |
지 | 180 | 1.8% |
문 | 180 | 1.8% |
로 | 168 | 1.7% |
입 | 168 | 1.7% |
Other values (202) | 7668 | 76.3% |
Common
Value | Count | Frequency (%) |
( | 1044 | 32.5% |
) | 1044 | 32.5% |
5 | 216 | 6.7% |
2 | 192 | 6.0% |
3 | 168 | 5.2% |
6 | 132 | 4.1% |
7 | 132 | 4.1% |
4 | 108 | 3.4% |
8 | 72 | 2.2% |
1 | 72 | 2.2% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 10044 | 75.7% |
ASCII | 3216 | 24.3% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
( | 1044 | 32.5% |
) | 1044 | 32.5% |
5 | 216 | 6.7% |
2 | 192 | 6.0% |
3 | 168 | 5.2% |
6 | 132 | 4.1% |
7 | 132 | 4.1% |
4 | 108 | 3.4% |
8 | 72 | 2.2% |
1 | 72 | 2.2% |
Hangul
Value | Count | Frequency (%) |
대 | 384 | 3.8% |
구 | 336 | 3.3% |
동 | 276 | 2.7% |
신 | 264 | 2.6% |
산 | 228 | 2.3% |
원 | 192 | 1.9% |
지 | 180 | 1.8% |
문 | 180 | 1.8% |
로 | 168 | 1.7% |
입 | 168 | 1.7% |
Other values (202) | 7668 | 76.3% |
수송연월
Categorical
HIGH CORRELATION
 
Distinct | 12 |
---|---|
Distinct (%) | 0.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 26.8 KiB |
Length
Max length | 7 |
---|---|
Median length | 7 |
Mean length | 7 |
Min length | 7 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2022-01 |
---|---|
2nd row | 2022-01 |
3rd row | 2022-01 |
4th row | 2022-01 |
5th row | 2022-01 |
Common Values
Value | Count | Frequency (%) |
2022-01 | 285 | 8.3% |
2022-02 | 285 | 8.3% |
2022-03 | 285 | 8.3% |
2022-04 | 285 | 8.3% |
2022-05 | 285 | 8.3% |
2022-06 | 285 | 8.3% |
2022-07 | 285 | 8.3% |
2022-08 | 285 | 8.3% |
2022-09 | 285 | 8.3% |
2022-10 | 285 | 8.3% |
Other values (2) | 570 | 16.7% |
수송인원수
Real number (ℝ)
Distinct | 3415 |
---|---|
Distinct (%) | 99.9% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 646515.9 |
Minimum | 35710 |
---|---|
Maximum | 3487771 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 30.2 KiB |
Quantile statistics
Minimum | 35710 |
---|---|
5-th percentile | 131222.95 |
Q1 | 316197.5 |
median | 514526 |
Q3 | 803657.5 |
95-th percentile | 1702785 |
Maximum | 3487771 |
Range | 3452061 |
Interquartile range (IQR) | 487460 |
Descriptive statistics
Standard deviation | 504783.03 |
---|---|
Coefficient of variation (CV) | 0.78077435 |
Kurtosis | 5.0127787 |
Mean | 646515.9 |
Median Absolute Deviation (MAD) | 226265.5 |
Skewness | 1.9727126 |
Sum | 2.2110844 × 10⁹ |
Variance | 2.5480591 × 10¹¹ |
Monotonicity | Not monotonic |
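Several of the figures above follow from the others, which gives a quick consistency check on the table. The sketch below recomputes them from the reported mean, standard deviation, quartiles, and extremes; small differences are expected because the inputs are rounded.

```python
import math

# Values copied from the 수송인원수 statistics.
n = 3420
mean = 646515.9
std = 504783.03
q1, q3 = 316197.5, 803657.5
minimum, maximum = 35710, 3487771

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
total = mean * n                 # sum
variance = std ** 2              # variance

print(f"CV={cv:.8f} IQR={iqr} range={value_range}")
print(f"sum={total:.7e} variance={variance:.7e}")
```

A skewness near 2 together with a mean well above the median (646515.9 against 514526) points to a right-skewed distribution: a small number of very busy stations pull the average up.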
Value | Count | Frequency (%) |
419526 | 2 | 0.1% |
215912 | 2 | 0.1% |
381652 | 2 | 0.1% |
743545 | 2 | 0.1% |
399622 | 2 | 0.1% |
1817867 | 1 | < 0.1% |
879118 | 1 | < 0.1% |
47304 | 1 | < 0.1% |
485532 | 1 | < 0.1% |
2301246 | 1 | < 0.1% |
Other values (3405) | 3405 | 99.6% |
Value | Count | Frequency (%) |
35710 | 1 | < 0.1% |
39500 | 1 | < 0.1% |
41259 | 1 | < 0.1% |
43648 | 1 | < 0.1% |
45579 | 1 | < 0.1% |
47192 | 1 | < 0.1% |
47304 | 1 | < 0.1% |
47879 | 1 | < 0.1% |
48396 | 1 | < 0.1% |
48800 | 1 | < 0.1% |
Value | Count | Frequency (%) |
3487771 | 1 | < 0.1% |
3329938 | 1 | < 0.1% |
3324704 | 1 | < 0.1% |
3292051 | 1 | < 0.1% |
3283757 | 1 | < 0.1% |
3198271 | 1 | < 0.1% |
3192214 | 1 | < 0.1% |
3099704 | 1 | < 0.1% |
3058742 | 1 | < 0.1% |
3057157 | 1 | < 0.1% |
Variable | 연번 | 호선 | 고유역번호(외부역코드) | 수송연월 | 수송인원수 |
---|---|---|---|---|---|
연번 | 1.000 | 0.204 | 0.181 | 0.959 | 0.065 |
호선 | 0.204 | 1.000 | 0.942 | 0.000 | 0.452 |
고유역번호(외부역코드) | 0.181 | 0.942 | 1.000 | 0.000 | 0.445 |
수송연월 | 0.959 | 0.000 | 0.000 | 1.000 | 0.034 |
수송인원수 | 0.065 | 0.452 | 0.445 | 0.034 | 1.000 |
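The "High correlation" alerts near the top of the report can be related to this matrix by listing the variable pairs whose coefficient exceeds a cut-off. The sketch below does that for the matrix above; the threshold of 0.5 and the helper `high_pairs` are assumptions made for illustration, not values read from the report's config.json.

```python
from itertools import combinations

columns = ["연번", "호선", "고유역번호(외부역코드)", "수송연월", "수송인원수"]
matrix = [
    [1.000, 0.204, 0.181, 0.959, 0.065],
    [0.204, 1.000, 0.942, 0.000, 0.452],
    [0.181, 0.942, 1.000, 0.000, 0.445],
    [0.959, 0.000, 0.000, 1.000, 0.034],
    [0.065, 0.452, 0.445, 0.034, 1.000],
]

THRESHOLD = 0.5  # hypothetical cut-off, chosen for illustration

def high_pairs(cols, m, threshold):
    """Return (a, b, coefficient) for each unordered pair at or above the threshold."""
    return [
        (cols[i], cols[j], m[i][j])
        for i, j in combinations(range(len(cols)), 2)
        if abs(m[i][j]) >= threshold
    ]

pairs = high_pairs(columns, matrix, THRESHOLD)
for a, b, value in pairs:
    print(f"{a} <-> {b}: {value:.3f}")
```

With this cut-off the two pairs found, 연번 with 수송연월 and 호선 with 고유역번호(외부역코드), are the same pairs named in the alerts.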
Variable | 연번 | 호선 | 고유역번호(외부역코드) | 수송인원수 | 수송연월 |
---|---|---|---|---|---|
연번 | 1.000 | 0.082 | 0.083 | 0.080 | 0.838 |
호선 | 0.082 | 1.000 | 0.989 | -0.349 | 0.000 |
고유역번호(외부역코드) | 0.083 | 0.989 | 1.000 | -0.374 | 0.000 |
수송인원수 | 0.080 | -0.349 | -0.374 | 1.000 | 0.015 |
수송연월 | 0.838 | 0.000 | 0.000 | 0.015 | 1.000 |
Index | 연번 | 호선 | 고유역번호(외부역코드) | 역명 | 수송연월 | 수송인원수 |
---|---|---|---|---|---|---|
0 | 1 | 1 | 150 | 서울역(1) | 2022-01 | 1817867 |
1 | 2 | 1 | 151 | 시청(1) | 2022-01 | 907397 |
2 | 3 | 1 | 152 | 종각 | 2022-01 | 1510584 |
3 | 4 | 1 | 153 | 종로3가(1) | 2022-01 | 1021499 |
4 | 5 | 1 | 154 | 종로5가 | 2022-01 | 1009266 |
5 | 6 | 1 | 155 | 동대문(1) | 2022-01 | 473477 |
6 | 7 | 1 | 156 | 신설동(1) | 2022-01 | 568506 |
7 | 8 | 1 | 157 | 제기동 | 2022-01 | 845594 |
8 | 9 | 1 | 158 | 청량리 | 2022-01 | 931116 |
9 | 10 | 1 | 159 | 동묘앞(1) | 2022-01 | 412234 |
Index | 연번 | 호선 | 고유역번호(외부역코드) | 역명 | 수송연월 | 수송인원수 |
---|---|---|---|---|---|---|
3410 | 3411 | 9 | 4129 | 봉은사 | 2022-12 | 1144553 |
3411 | 3412 | 9 | 4130 | 종합운동장(9) | 2022-12 | 268140 |
3412 | 3413 | 9 | 4131 | 삼전 | 2022-12 | 363282 |
3413 | 3414 | 9 | 4132 | 석촌고분 | 2022-12 | 345068 |
3414 | 3415 | 9 | 4133 | 석촌(9) | 2022-12 | 450396 |
3415 | 3416 | 9 | 4134 | 송파나루 | 2022-12 | 303843 |
3416 | 3417 | 9 | 4135 | 한성백제 | 2022-12 | 131958 |
3417 | 3418 | 9 | 4136 | 올림픽공원(9) | 2022-12 | 487406 |
3418 | 3419 | 9 | 4137 | 둔촌오륜 | 2022-12 | 56621 |
3419 | 3420 | 9 | 4138 | 중앙보훈병원 | 2022-12 | 527143 |
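The sample rows above can be read into typed records for further work. The sketch below parses one pipe-delimited row as it appears in this report, where the first cell is the row index; the `Ridership` class and its English field names are hypothetical and map to the Korean column names in the comments.

```python
from dataclasses import dataclass

@dataclass
class Ridership:
    serial: int        # 연번
    line: int          # 호선
    station_code: int  # 고유역번호(외부역코드)
    station: str       # 역명
    month: str         # 수송연월, formatted YYYY-MM
    passengers: int    # 수송인원수

def parse_row(text: str) -> Ridership:
    """Parse one sample row; the leading cell is the row index and is discarded."""
    cells = [c.strip() for c in text.strip().rstrip("|").split("|")]
    _, serial, line, code, station, month, passengers = cells
    return Ridership(int(serial), int(line), int(code), station, month, int(passengers))

first = parse_row("0 | 1 | 1 | 150 | 서울역(1) | 2022-01 | 1817867 |")
last = parse_row("3419 | 3420 | 9 | 4138 | 중앙보훈병원 | 2022-12 | 527143 |")
print(first)
print(last)
```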