Dataset statistics
Number of variables | 8 |
---|---|
Number of observations | 272 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 18.7 KiB |
Average record size in memory | 70.5 B |
Variable types
Numeric | 6 |
---|---|
Text | 1 |
Categorical | 1 |
Dataset
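Description | Detailed per-station data from Seoul Metro (서울교통공사). The data consists of the following fields: 연번 (serial number), 호선 (line), 고유역번호 (unique station number), 역번호 (station number), 역명 (station name), 직원수 (number of staff), 수송인원 (passengers carried), 운수수입 (transport revenue), and 비고 (remarks). |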
---|---|
URL | https://www.data.go.kr/data/15107020/fileData.do |
Alerts
연번 is highly overall correlated with 호선 and 3 other fields | High correlation |
호선 is highly overall correlated with 연번 and 3 other fields | High correlation |
역번호 is highly overall correlated with 연번 and 3 other fields | High correlation |
직원수 is highly overall correlated with 연번 and 5 other fields | High correlation |
수송인원 is highly overall correlated with 직원수 and 2 other fields | High correlation |
운수수입 is highly overall correlated with 직원수 and 2 other fields | High correlation |
비고 is highly overall correlated with 연번 and 5 other fields | High correlation |
비고 is highly imbalanced (80.9%) | Imbalance |
연번 has unique values | Unique |
역번호 has unique values | Unique |
역명 has unique values | Unique |
수송인원 has unique values | Unique |
운수수입 has unique values | Unique |
직원수 has 7 (2.6%) zeros | Zeros |
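The 80.9% imbalance figure for 비고 can be reproduced from the two class counts listed later in the report (264 and 8). The sketch below assumes the score is one minus the normalized Shannon entropy of the class shares; that formula is an assumption that happens to match the reported value, not a statement of how the tool is implemented. The function name `imbalance_score` is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the base-2 Shannon entropy
    of the class shares and k is the number of classes.

    Assumed formula: 0 means perfectly balanced, 1 means a single class.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# 비고 class counts taken from the report: 264 "<NA>" and 8 "통합환승역".
score = imbalance_score([264, 8])
print(f"{score:.1%}")
```

A perfectly balanced column would score 0 under this definition, so a value near 0.81 indicates that one class dominates.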
Reproduction
Analysis started | 2023-12-12 00:34:47.148874 |
---|---|
Analysis finished | 2023-12-12 00:34:50.089688 |
Duration | 2.94 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
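As a quick consistency check, the reported duration follows from the two timestamps above. This is a minimal sketch using only the standard library; the format string is an assumption about how the timestamps are laid out.

```python
from datetime import datetime

# Assumed timestamp layout: date, time, and microseconds.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 00:34:47.148874", FMT)
finished = datetime.strptime("2023-12-12 00:34:50.089688", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```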
연번
Real number (ℝ)
HIGH CORRELATION
  UNIQUE
 
Distinct | 272 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 136.5 |
Minimum | 1 |
---|---|
Maximum | 272 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.5 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 14.55 |
Q1 | 68.75 |
median | 136.5 |
Q3 | 204.25 |
95-th percentile | 258.45 |
Maximum | 272 |
Range | 271 |
Interquartile range (IQR) | 135.5 |
Descriptive statistics
Standard deviation | 78.663842 |
---|---|
Coefficient of variation (CV) | 0.57629188 |
Kurtosis | -1.2 |
Mean | 136.5 |
Median Absolute Deviation (MAD) | 68 |
Skewness | 0 |
Sum | 37128 |
Variance | 6188 |
Monotonicity | Strictly increasing |
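The statistics above are what one would expect if 연번 is simply the sequence 1 to 272. The sketch below assumes exactly that (it is consistent with the reported minimum, maximum, distinct count, and sum, but the raw column is not shown in full) and recomputes the figures with the standard library. The `quantile` helper is hypothetical and uses linear interpolation between neighbouring order statistics.

```python
import math
import statistics

# Assumption: 연번 holds the integers 1..272.
values = list(range(1, 273))


def quantile(sorted_values, p):
    """Quantile at fraction p using linear interpolation."""
    pos = (len(sorted_values) - 1) * p
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)
std = math.sqrt(variance)
cv = std / mean
median = statistics.median(values)
mad = statistics.median([abs(v - median) for v in values])

print(mean, variance, round(std, 6), round(cv, 8), mad)
print(quantile(values, 0.05), quantile(values, 0.25), quantile(values, 0.75))
```

The match with the table suggests the reported variance is the sample variance rather than the population variance.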
Value | Count | Frequency (%) |
1 | 1 | 0.4% |
181 | 1 | 0.4% |
187 | 1 | 0.4% |
186 | 1 | 0.4% |
185 | 1 | 0.4% |
184 | 1 | 0.4% |
183 | 1 | 0.4% |
182 | 1 | 0.4% |
180 | 1 | 0.4% |
138 | 1 | 0.4% |
Other values (262) | 262 | 96.3% |
Value | Count | Frequency (%) |
1 | 1 | 0.4% |
2 | 1 | 0.4% |
3 | 1 | 0.4% |
4 | 1 | 0.4% |
5 | 1 | 0.4% |
6 | 1 | 0.4% |
7 | 1 | 0.4% |
8 | 1 | 0.4% |
9 | 1 | 0.4% |
10 | 1 | 0.4% |
Value | Count | Frequency (%) |
272 | 1 | 0.4% |
271 | 1 | 0.4% |
270 | 1 | 0.4% |
269 | 1 | 0.4% |
268 | 1 | 0.4% |
267 | 1 | 0.4% |
266 | 1 | 0.4% |
265 | 1 | 0.4% |
264 | 1 | 0.4% |
263 | 1 | 0.4% |
호선
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 8 |
---|---|
Distinct (%) | 2.9% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 4.6066176 |
Minimum | 1 |
---|---|
Maximum | 8 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.5 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 2 |
Q1 | 3 |
median | 5 |
Q3 | 6 |
95-th percentile | 8 |
Maximum | 8 |
Range | 7 |
Interquartile range (IQR) | 3 |
Descriptive statistics
Standard deviation | 2.008201 |
---|---|
Coefficient of variation (CV) | 0.43593828 |
Kurtosis | -1.1478026 |
Mean | 4.6066176 |
Median Absolute Deviation (MAD) | 2 |
Skewness | -0.052624328 |
Sum | 1253 |
Variance | 4.0328712 |
Monotonicity | Increasing |
Value | Count | Frequency (%) |
5 | 56 | 20.6% |
2 | 50 | 18.4% |
7 | 42 | 15.4% |
6 | 37 | 13.6% |
3 | 33 | 12.1% |
4 | 26 | 9.6% |
8 | 18 | 6.6% |
1 | 10 | 3.7% |
Value | Count | Frequency (%) |
1 | 10 | 3.7% |
2 | 50 | 18.4% |
3 | 33 | 12.1% |
4 | 26 | 9.6% |
5 | 56 | 20.6% |
6 | 37 | 13.6% |
7 | 42 | 15.4% |
8 | 18 | 6.6% |
Value | Count | Frequency (%) |
8 | 18 | 6.6% |
7 | 42 | 15.4% |
6 | 37 | 13.6% |
5 | 56 | 20.6% |
4 | 26 | 9.6% |
3 | 33 | 12.1% |
2 | 50 | 18.4% |
1 | 10 | 3.7% |
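Because 호선 has only eight distinct values, its summary statistics can be rebuilt from the frequency table alone. The sketch below takes the counts listed above and recomputes the sum, mean, sample variance, and percentage shares; the variable names are illustrative only.

```python
# Station counts per line (호선), copied from the frequency table.
counts = {1: 10, 2: 50, 3: 33, 4: 26, 5: 56, 6: 37, 7: 42, 8: 18}

n = sum(counts.values())                                 # number of observations
total = sum(line * c for line, c in counts.items())      # sum of all 호선 values
mean = total / n

# Sample variance (divides by n - 1), weighted by the counts.
variance = sum(c * (line - mean) ** 2 for line, c in counts.items()) / (n - 1)

# Share of each line as a percentage, rounded to one decimal place.
shares = {line: round(100 * c / n, 1) for line, c in counts.items()}

print(n, total, round(mean, 7), round(variance, 7))
print(shares)
```

The recomputed values agree with the reported sum (1253), mean, and variance, which is a useful check that the frequency table was extracted intact.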
역번호
Real number (ℝ)
HIGH CORRELATION
  UNIQUE
 
Distinct | 272 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1615.6654 |
Minimum | 150 |
---|---|
Maximum | 2828 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.5 KiB |
Quantile statistics
Minimum | 150 |
---|---|
5-th percentile | 204.55 |
Q1 | 316.75 |
median | 2527.5 |
Q3 | 2640.25 |
95-th percentile | 2814.45 |
Maximum | 2828 |
Range | 2678 |
Interquartile range (IQR) | 2323.5 |
Descriptive statistics
Standard deviation | 1174.9919 |
---|---|
Coefficient of variation (CV) | 0.7272495 |
Kurtosis | -1.9259226 |
Mean | 1615.6654 |
Median Absolute Deviation (MAD) | 284 |
Skewness | -0.24809565 |
Sum | 439461 |
Variance | 1380605.9 |
Monotonicity | Strictly increasing |
Value | Count | Frequency (%) |
150 | 1 | 0.4% |
2617 | 1 | 0.4% |
2623 | 1 | 0.4% |
2622 | 1 | 0.4% |
2621 | 1 | 0.4% |
2620 | 1 | 0.4% |
2619 | 1 | 0.4% |
2618 | 1 | 0.4% |
2616 | 1 | 0.4% |
2529 | 1 | 0.4% |
Other values (262) | 262 | 96.3% |
Value | Count | Frequency (%) |
150 | 1 | 0.4% |
151 | 1 | 0.4% |
152 | 1 | 0.4% |
153 | 1 | 0.4% |
154 | 1 | 0.4% |
155 | 1 | 0.4% |
156 | 1 | 0.4% |
157 | 1 | 0.4% |
158 | 1 | 0.4% |
159 | 1 | 0.4% |
Value | Count | Frequency (%) |
2828 | 1 | 0.4% |
2827 | 1 | 0.4% |
2826 | 1 | 0.4% |
2825 | 1 | 0.4% |
2824 | 1 | 0.4% |
2823 | 1 | 0.4% |
2822 | 1 | 0.4% |
2821 | 1 | 0.4% |
2820 | 1 | 0.4% |
2819 | 1 | 0.4% |
역명
Text
UNIQUE
 
Distinct | 272 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.3 KiB |
Value | Count | Frequency (%) |
서울역(1) | 1 | 0.4% |
독바위 | 1 | 0.4% |
하남시청 | 1 | 0.4% |
하남검단산 | 1 | 0.4% |
응암 | 1 | 0.4% |
역촌 | 1 | 0.4% |
불광(6) | 1 | 0.4% |
시청(1) | 1 | 0.4% |
새절 | 1 | 0.4% |
청량리 | 1 | 0.4% |
Other values (262) | 262 | 96.3% |
Most occurring characters
Value | Count | Frequency (%) |
( | 84 | 8.0% |
) | 84 | 8.0% |
대 | 32 | 3.1% |
구 | 28 | 2.7% |
동 | 22 | 2.1% |
신 | 22 | 2.1% |
산 | 19 | 1.8% |
5 | 18 | 1.7% |
2 | 16 | 1.5% |
문 | 15 | 1.4% |
Other values (204) | 708 | 67.6% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 789 | 75.3% |
Decimal Number | 91 | 8.7% |
Open Punctuation | 84 | 8.0% |
Close Punctuation | 84 | 8.0% |
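The category names in this table correspond to Unicode general categories (Lo, Nd, Ps, Pe). As a rough illustration, the sketch below classifies the characters of five station names taken from the sample rows of this report using Python's `unicodedata` module. It covers only those five names, so the counts will not match the full-column totals above.

```python
import unicodedata
from collections import Counter

# Five station names copied from the first rows of the dataset.
names = ["서울역(1)", "시청(1)", "종각", "종로3가(1)", "종로5가"]

# Readable labels for the Unicode general categories that occur here.
LABELS = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}

# Count characters by general category across all names.
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)

for code, count in category_counts.most_common():
    print(LABELS.get(code, code), count)
```

Hangul syllables fall under "Other Letter" because the script has no upper or lower case, which is why that category dominates the column.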
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
대 | 32 | 4.1% |
구 | 28 | 3.5% |
동 | 22 | 2.8% |
신 | 22 | 2.8% |
산 | 19 | 2.4% |
문 | 15 | 1.9% |
지 | 15 | 1.9% |
로 | 14 | 1.8% |
입 | 14 | 1.8% |
원 | 14 | 1.8% |
Other values (194) | 594 | 75.3% |
Decimal Number
Value | Count | Frequency (%) |
5 | 18 | 19.8% |
2 | 16 | 17.6% |
3 | 14 | 15.4% |
6 | 11 | 12.1% |
7 | 11 | 12.1% |
4 | 9 | 9.9% |
1 | 6 | 6.6% |
8 | 6 | 6.6% |
Open Punctuation
Value | Count | Frequency (%) |
( | 84 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 84 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 789 | 75.3% |
Common | 259 | 24.7% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
대 | 32 | 4.1% |
구 | 28 | 3.5% |
동 | 22 | 2.8% |
신 | 22 | 2.8% |
산 | 19 | 2.4% |
문 | 15 | 1.9% |
지 | 15 | 1.9% |
로 | 14 | 1.8% |
입 | 14 | 1.8% |
원 | 14 | 1.8% |
Other values (194) | 594 | 75.3% |
Common
Value | Count | Frequency (%) |
( | 84 | 32.4% |
) | 84 | 32.4% |
5 | 18 | 6.9% |
2 | 16 | 6.2% |
3 | 14 | 5.4% |
6 | 11 | 4.2% |
7 | 11 | 4.2% |
4 | 9 | 3.5% |
1 | 6 | 2.3% |
8 | 6 | 2.3% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 789 | 75.3% |
ASCII | 259 | 24.7% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
( | 84 | 32.4% |
) | 84 | 32.4% |
5 | 18 | 6.9% |
2 | 16 | 6.2% |
3 | 14 | 5.4% |
6 | 11 | 4.2% |
7 | 11 | 4.2% |
4 | 9 | 3.5% |
1 | 6 | 2.3% |
8 | 6 | 2.3% |
Hangul
Value | Count | Frequency (%) |
대 | 32 | 4.1% |
구 | 28 | 3.5% |
동 | 22 | 2.8% |
신 | 22 | 2.8% |
산 | 19 | 2.4% |
문 | 15 | 1.9% |
지 | 15 | 1.9% |
로 | 14 | 1.8% |
입 | 14 | 1.8% |
원 | 14 | 1.8% |
Other values (194) | 594 | 75.3% |
직원수
Real number (ℝ)
HIGH CORRELATION
  ZEROS
 
Distinct | 19 |
---|---|
Distinct (%) | 7.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 12.393382 |
Minimum | 0 |
---|---|
Maximum | 25 |
Zeros | 7 |
Zeros (%) | 2.6% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.5 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 9 |
Q1 | 10 |
median | 12 |
Q3 | 14 |
95-th percentile | 18.45 |
Maximum | 25 |
Range | 25 |
Interquartile range (IQR) | 4 |
Descriptive statistics
Standard deviation | 3.729264 |
---|---|
Coefficient of variation (CV) | 0.30090768 |
Kurtosis | 2.7911915 |
Mean | 12.393382 |
Median Absolute Deviation (MAD) | 2 |
Skewness | -0.077940894 |
Sum | 3371 |
Variance | 13.90741 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
10 | 66 | 24.3% |
13 | 48 | 17.6% |
14 | 25 | 9.2% |
11 | 23 | 8.5% |
9 | 21 | 7.7% |
16 | 17 | 6.2% |
15 | 16 | 5.9% |
12 | 16 | 5.9% |
17 | 9 | 3.3% |
0 | 7 | 2.6% |
Other values (9) | 24 | 8.8% |
Value | Count | Frequency (%) |
0 | 7 | 2.6% |
8 | 4 | 1.5% |
9 | 21 | 7.7% |
10 | 66 | 24.3% |
11 | 23 | 8.5% |
12 | 16 | 5.9% |
13 | 48 | 17.6% |
14 | 25 | 9.2% |
15 | 16 | 5.9% |
16 | 17 | 6.2% |
Value | Count | Frequency (%) |
25 | 1 | 0.4% |
24 | 1 | 0.4% |
23 | 2 | 0.7% |
22 | 2 | 0.7% |
21 | 3 | 1.1% |
20 | 1 | 0.4% |
19 | 4 | 1.5% |
18 | 6 | 2.2% |
17 | 9 | 3.3% |
16 | 17 | 6.2% |
수송인원
Real number (ℝ)
HIGH CORRELATION
  UNIQUE
 
Distinct | 272 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 7918056.3 |
Minimum | 669461 |
---|---|
Maximum | 36184549 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.5 KiB |
Quantile statistics
Minimum | 669461 |
---|---|
5-th percentile | 1731406.9 |
Q1 | 3985068.2 |
median | 6462403.5 |
Q3 | 9776761.8 |
95-th percentile | 20332837 |
Maximum | 36184549 |
Range | 35515088 |
Interquartile range (IQR) | 5791693.5 |
Descriptive statistics
Standard deviation | 6041102.6 |
---|---|
Coefficient of variation (CV) | 0.76295272 |
Kurtosis | 4.4382625 |
Mean | 7918056.3 |
Median Absolute Deviation (MAD) | 2753342 |
Skewness | 1.8839011 |
Sum | 2.1537113 × 10⁹ |
Variance | 3.649492 × 10¹³ |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
26147934 | 1 | 0.4% |
6492776 | 1 | 0.4% |
5950511 | 1 | 0.4% |
7790745 | 1 | 0.4% |
7001749 | 1 | 0.4% |
2971823 | 1 | 0.4% |
7232219 | 1 | 0.4% |
5245476 | 1 | 0.4% |
3679235 | 1 | 0.4% |
7226231 | 1 | 0.4% |
Other values (262) | 262 | 96.3% |
Value | Count | Frequency (%) |
669461 | 1 | 0.4% |
705465 | 1 | 0.4% |
707705 | 1 | 0.4% |
949171 | 1 | 0.4% |
1033089 | 1 | 0.4% |
1065634 | 1 | 0.4% |
1139361 | 1 | 0.4% |
1268406 | 1 | 0.4% |
1323311 | 1 | 0.4% |
1357138 | 1 | 0.4% |
Value | Count | Frequency (%) |
36184549 | 1 | 0.4% |
34209780 | 1 | 0.4% |
33488236 | 1 | 0.4% |
28109759 | 1 | 0.4% |
27262648 | 1 | 0.4% |
26147934 | 1 | 0.4% |
24498111 | 1 | 0.4% |
24367213 | 1 | 0.4% |
24183085 | 1 | 0.4% |
23691521 | 1 | 0.4% |
운수수입
Real number (ℝ)
HIGH CORRELATION
  UNIQUE
 
Distinct | 272 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 4.914783 × 10⁹ |
Minimum | 3.8038068 × 10⁸ |
---|---|
Maximum | 2.7481205 × 10¹⁰ |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.5 KiB |
Quantile statistics
Minimum | 3.8038068 × 10⁸ |
---|---|
5-th percentile | 1.0615402 × 10⁹ |
Q1 | 2.3128102 × 10⁹ |
median | 3.9254965 × 10⁹ |
Q3 | 5.8230576 × 10⁹ |
95-th percentile | 1.3304541 × 10¹⁰ |
Maximum | 2.7481205 × 10¹⁰ |
Range | 2.7100825 × 10¹⁰ |
Interquartile range (IQR) | 3.5102474 × 10⁹ |
Descriptive statistics
Standard deviation | 4.0803786 × 10⁹ |
---|---|
Coefficient of variation (CV) | 0.83022558 |
Kurtosis | 6.9320962 |
Mean | 4.914783 × 10⁹ |
Median Absolute Deviation (MAD) | 1.728287 × 10⁹ |
Skewness | 2.2808703 |
Sum | 1.336821 × 10¹² |
Variance | 1.6649489 × 10¹⁹ |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
15927243582 | 1 | 0.4% |
3349927405 | 1 | 0.4% |
4428010776 | 1 | 0.4% |
4456146331 | 1 | 0.4% |
4272881498 | 1 | 0.4% |
1352838588 | 1 | 0.4% |
3811517829 | 1 | 0.4% |
2880898637 | 1 | 0.4% |
2259195960 | 1 | 0.4% |
4418945658 | 1 | 0.4% |
Other values (262) | 262 | 96.3% |
Value | Count | Frequency (%) |
380380680 | 1 | 0.4% |
401515910 | 1 | 0.4% |
413668919 | 1 | 0.4% |
468338892 | 1 | 0.4% |
581406995 | 1 | 0.4% |
604125558 | 1 | 0.4% |
711998838 | 1 | 0.4% |
765376031 | 1 | 0.4% |
792845652 | 1 | 0.4% |
936198836 | 1 | 0.4% |
Value | Count | Frequency (%) |
27481205327 | 1 | 0.4% |
24665863082 | 1 | 0.4% |
22622478425 | 1 | 0.4% |
18401107571 | 1 | 0.4% |
17950178469 | 1 | 0.4% |
17406845872 | 1 | 0.4% |
16802404107 | 1 | 0.4% |
16156923598 | 1 | 0.4% |
15927243582 | 1 | 0.4% |
15766554313 | 1 | 0.4% |
비고
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 2 |
---|---|
Distinct (%) | 0.7% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.3 KiB |
<NA> | 264 |
---|---|
통합환승역 | 8 |
Length
Max length | 5 |
---|---|
Median length | 4 |
Mean length | 4.0294118 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 264 | 97.1% |
통합환승역 | 8 | 2.9% |
Common Values (Plot)
Value | Count | Frequency (%) |
<NA> | 264 | 97.1% |
통합환승역 | 8 | 2.9% |
Correlations
Variable | 연번 | 호선 | 역번호 | 직원수 | 수송인원 | 운수수입 |
---|---|---|---|---|---|---|
연번 | 1.000 | 0.918 | 0.916 | 0.432 | 0.422 | 0.417 |
호선 | 0.918 | 1.000 | 0.994 | 0.632 | 0.401 | 0.368 |
역번호 | 0.916 | 0.994 | 1.000 | 0.630 | 0.369 | 0.325 |
직원수 | 0.432 | 0.632 | 0.630 | 1.000 | 0.677 | 0.669 |
수송인원 | 0.422 | 0.401 | 0.369 | 0.677 | 1.000 | 0.961 |
운수수입 | 0.417 | 0.368 | 0.325 | 0.669 | 0.961 | 1.000 |
Variable | 연번 | 호선 | 역번호 | 직원수 | 수송인원 | 운수수입 | 비고 |
---|---|---|---|---|---|---|---|
연번 | 1.000 | 0.988 | 1.000 | -0.533 | -0.361 | -0.342 | 1.000 |
호선 | 0.988 | 1.000 | 0.988 | -0.526 | -0.334 | -0.312 | 1.000 |
역번호 | 1.000 | 0.988 | 1.000 | -0.533 | -0.361 | -0.342 | 1.000 |
직원수 | -0.533 | -0.526 | -0.533 | 1.000 | 0.621 | 0.605 | 1.000 |
수송인원 | -0.361 | -0.334 | -0.361 | 0.621 | 1.000 | 0.981 | 1.000 |
운수수입 | -0.342 | -0.312 | -0.342 | 0.605 | 0.981 | 1.000 | 1.000 |
비고 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
First rows
Index | 연번 | 호선 | 역번호 | 역명 | 직원수 | 수송인원 | 운수수입 | 비고 |
---|---|---|---|---|---|---|---|---|
0 | 1 | 1 | 150 | 서울역(1) | 22 | 26147934 | 15927243582 | <NA> |
1 | 2 | 1 | 151 | 시청(1) | 16 | 13417043 | 7774169453 | <NA> |
2 | 3 | 1 | 152 | 종각 | 16 | 20637943 | 11798053057 | <NA> |
3 | 4 | 1 | 153 | 종로3가(1) | 20 | 13532303 | 6281939386 | <NA> |
4 | 5 | 1 | 154 | 종로5가 | 16 | 13602294 | 5543430536 | <NA> |
5 | 6 | 1 | 155 | 동대문(1) | 15 | 6224288 | 2914335075 | <NA> |
6 | 7 | 1 | 156 | 신설동(1) | 17 | 7288137 | 3505996122 | <NA> |
7 | 8 | 1 | 157 | 제기동 | 15 | 10106679 | 3084662385 | <NA> |
8 | 9 | 1 | 158 | 청량리 | 21 | 11659963 | 3988614651 | <NA> |
9 | 10 | 1 | 159 | 동묘앞(1) | 19 | 5424831 | 1744747941 | <NA> |
Last rows
Index | 연번 | 호선 | 역번호 | 역명 | 직원수 | 수송인원 | 운수수입 | 비고 |
---|---|---|---|---|---|---|---|---|
262 | 263 | 8 | 2819 | 문정 | 10 | 11255869 | 7059110124 | <NA> |
263 | 264 | 8 | 2820 | 장지 | 10 | 9171769 | 5305520797 | <NA> |
264 | 265 | 8 | 2821 | 복정(8) | 11 | 5162842 | 2657913479 | 통합환승역 |
265 | 266 | 8 | 2822 | 산성 | 10 | 2834299 | 1684784296 | <NA> |
266 | 267 | 8 | 2823 | 남한산성입구 | 10 | 7238787 | 4217169675 | <NA> |
267 | 268 | 8 | 2824 | 단대오거리 | 9 | 5601289 | 3243188905 | <NA> |
268 | 269 | 8 | 2825 | 신흥 | 10 | 2346044 | 1310608599 | <NA> |
269 | 270 | 8 | 2826 | 수진 | 8 | 2669274 | 1598827339 | <NA> |
270 | 271 | 8 | 2827 | 모란(8) | 12 | 1842903 | 1064418380 | <NA> |
271 | 272 | 8 | 2828 | 남위례 | 11 | 2798414 | 1694702043 | <NA> |