Dataset statistics
Number of variables | 9 |
---|---|
Number of observations | 500 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 37.7 KiB |
Average record size in memory | 77.3 B |
Variable types
DateTime | 1 |
---|---|
Numeric | 4 |
Categorical | 3 |
Text | 1 |
Dataset
Description | Sample data |
---|---|
Author | Seoul City (smart card company) |
URL | https://bigdata.seoul.go.kr/data/selectSampleData.do?sample_data_seq=14 |
호선ID is highly overall correlated with 역ID and 1 other fields | High correlation |
역ID is highly overall correlated with 호선ID and 1 other fields | High correlation |
승차총승객수 is highly overall correlated with 하차총승객수 | High correlation |
하차총승객수 is highly overall correlated with 승차총승객수 | High correlation |
호선 is highly overall correlated with 호선ID and 1 other fields | High correlation |
승차총승객수 has 8 (1.6%) zeros | Zeros |
하차총승객수 has 15 (3.0%) zeros | Zeros |
Reproduction
Analysis started | 2024-04-16 11:26:20.862027 |
---|---|
Analysis finished | 2024-04-16 11:26:22.978626 |
Duration | 2.12 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
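As a quick consistency check on the table above, the reported duration can be recomputed from the two timestamps. This is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
started = datetime.fromisoformat("2024-04-16 11:26:20.862027")
finished = datetime.fromisoformat("2024-04-16 11:26:22.978626")

# Elapsed wall-clock time between the two timestamps, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```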
운행일자 (operation date)
Date
Distinct | 360 |
---|---|
Distinct (%) | 72.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.0 KiB |
Minimum | 2014-01-02 00:00:00 |
---|---|
Maximum | 2015-10-31 00:00:00 |
호선ID (line ID)
Real number (ℝ)
HIGH CORRELATION
Distinct | 10 |
---|---|
Distinct (%) | 2.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 149.978 |
Minimum | 1 |
---|---|
Maximum | 404 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 4.5 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 2 |
Q1 | 3 |
median | 205 |
Q3 | 207 |
95-th percentile | 401 |
Maximum | 404 |
Range | 403 |
Interquartile range (IQR) | 204 |
Descriptive statistics
Standard deviation | 128.22538 |
---|---|
Coefficient of variation (CV) | 0.85496129 |
Kurtosis | -0.76439235 |
Mean | 149.978 |
Median Absolute Deviation (MAD) | 3 |
Skewness | 0.26144135 |
Sum | 74989 |
Variance | 16441.749 |
Monotonicity | Not monotonic |
Common Values
Value | Count | Frequency (%) |
205 | 79 | 15.8% |
2 | 78 | 15.6% |
207 | 77 | 15.4% |
206 | 75 | 15.0% |
3 | 60 | 12.0% |
401 | 39 | 7.8% |
4 | 32 | 6.4% |
208 | 29 | 5.8% |
1 | 18 | 3.6% |
404 | 13 | 2.6% |
Minimum 10 values
Value | Count | Frequency (%) |
1 | 18 | 3.6% |
2 | 78 | 15.6% |
3 | 60 | 12.0% |
4 | 32 | 6.4% |
205 | 79 | 15.8% |
206 | 75 | 15.0% |
207 | 77 | 15.4% |
208 | 29 | 5.8% |
401 | 39 | 7.8% |
404 | 13 | 2.6% |
Maximum 10 values
Value | Count | Frequency (%) |
404 | 13 | 2.6% |
401 | 39 | 7.8% |
208 | 29 | 5.8% |
207 | 77 | 15.4% |
206 | 75 | 15.0% |
205 | 79 | 15.8% |
4 | 32 | 6.4% |
3 | 60 | 12.0% |
2 | 78 | 15.6% |
1 | 18 | 3.6% |
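Because 호선ID has only ten distinct values, its summary statistics can be recomputed from the value counts alone. The sketch below rebuilds the 500 observations from those counts and derives the sum, mean, sample variance, standard deviation, coefficient of variation, and quartiles with the standard library. Using the `n - 1` denominator for the variance and the `"inclusive"` quantile method are assumptions chosen because they agree with the figures in this report, not a statement of how the profiler computes them.

```python
import statistics

# Value counts for 호선ID, copied from the frequency table above.
counts = {1: 18, 2: 78, 3: 60, 4: 32, 205: 79,
          206: 75, 207: 77, 208: 29, 401: 39, 404: 13}

# Expand the counts back into a sorted list of 500 observations.
values = sorted(v for v, n in counts.items() for _ in range(n))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)
cv = std / mean                         # coefficient of variation
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

print(total, mean, round(variance, 3), round(std, 5), round(cv, 6))
print(q1, median, q3, iqr)
```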
호선 (line)
Categorical
HIGH CORRELATION
Distinct | 10 |
---|---|
Distinct (%) | 2.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.0 KiB |
5호선 | 79 |
---|---|
2호선 | 78 |
7호선 | 77 |
6호선 | 75 |
3호선 | 60 |
Other values (5) | 131 |
Length
Max length | 6 |
---|---|
Median length | 3 |
Mean length | 3.078 |
Min length | 3 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 6호선 |
---|---|
2nd row | 8호선 |
3rd row | 6호선 |
4th row | 6호선 |
5th row | 2호선 |
Common Values
Value | Count | Frequency (%) |
5호선 | 79 | 15.8% |
2호선 | 78 | 15.6% |
7호선 | 77 | 15.4% |
6호선 | 75 | 15.0% |
3호선 | 60 | 12.0% |
9호선 | 39 | 7.8% |
4호선 | 32 | 6.4% |
8호선 | 29 | 5.8% |
1호선 | 18 | 3.6% |
9호선2단계 | 13 | 2.6% |
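The Frequency (%) column in these tables is each count divided by the 500 observations in the dataset. A minimal sketch of that arithmetic, with counts copied from the table above and illustrative variable names:

```python
# Counts copied from the Common Values table for 호선.
n_observations = 500
line_counts = {"5호선": 79, "2호선": 78, "7호선": 77, "6호선": 75, "3호선": 60,
               "9호선": 39, "4호선": 32, "8호선": 29, "1호선": 18, "9호선2단계": 13}

# Share of all observations, formatted to one decimal place as in the report.
frequency = {label: f"{100 * count / n_observations:.1f}%"
             for label, count in line_counts.items()}

for label, count in line_counts.items():
    print(f"{label} | {count} | {frequency[label]} |")
```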
역ID (station ID)
Real number (ℝ)
HIGH CORRELATION
Distinct | 243 |
---|---|
Distinct (%) | 48.6% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1914.57 |
Minimum | 150 |
---|---|
Maximum | 4130 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 4.5 KiB |
Quantile statistics
Minimum | 150 |
---|---|
5-th percentile | 203 |
Q1 | 324 |
median | 2547.5 |
Q3 | 2730.5 |
95-th percentile | 4117.05 |
Maximum | 4130 |
Range | 3980 |
Interquartile range (IQR) | 2406.5 |
Descriptive statistics
Standard deviation | 1341.879 |
---|---|
Coefficient of variation (CV) | 0.7008775 |
Kurtosis | -1.3523634 |
Mean | 1914.57 |
Median Absolute Deviation (MAD) | 273.5 |
Skewness | -0.13006505 |
Sum | 957285 |
Variance | 1800639.4 |
Monotonicity | Not monotonic |
Common Values
Value | Count | Frequency (%) |
309 | 6 | 1.2% |
2537 | 5 | 1.0% |
4117 | 5 | 1.0% |
2730 | 5 | 1.0% |
226 | 5 | 1.0% |
2526 | 5 | 1.0% |
4126 | 4 | 0.8% |
4110 | 4 | 0.8% |
424 | 4 | 0.8% |
4125 | 4 | 0.8% |
Other values (233) | 453 | 90.6% |
Minimum 10 values
Value | Count | Frequency (%) |
150 | 1 | 0.2% |
151 | 3 | 0.6% |
153 | 2 | 0.4% |
155 | 1 | 0.2% |
156 | 1 | 0.2% |
157 | 3 | 0.6% |
158 | 4 | 0.8% |
159 | 3 | 0.6% |
201 | 3 | 0.6% |
202 | 3 | 0.6% |
Maximum 10 values
Value | Count | Frequency (%) |
4130 | 2 | 0.4% |
4129 | 4 | 0.8% |
4128 | 2 | 0.4% |
4127 | 1 | 0.2% |
4126 | 4 | 0.8% |
4125 | 4 | 0.8% |
4124 | 1 | 0.2% |
4123 | 2 | 0.4% |
4122 | 2 | 0.4% |
4121 | 1 | 0.2% |
역 (station)
Text
Distinct | 217 |
---|---|
Distinct (%) | 43.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.0 KiB |
Value | Count | Frequency (%) |
동대문역사문화공원 | 9 | 1.8% |
영등포구청 | 8 | 1.6% |
천호 | 7 | 1.4% |
고속터미널 | 7 | 1.4% |
을지로3가 | 6 | 1.2% |
지축 | 6 | 1.2% |
시청 | 6 | 1.2% |
가락시장 | 6 | 1.2% |
신길 | 5 | 1.0% |
종로3가 | 5 | 1.0% |
Other values (207) | 435 | 87.0% |
Most occurring characters
Value | Count | Frequency (%) |
대 | 56 | 3.5% |
구 | 53 | 3.3% |
신 | 50 | 3.1% |
동 | 49 | 3.1% |
지 | 37 | 2.3% |
청 | 35 | 2.2% |
원 | 35 | 2.2% |
사 | 32 | 2.0% |
산 | 27 | 1.7% |
문 | 26 | 1.6% |
Other values (191) | 1189 | 74.8% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 1552 | 97.7% |
Decimal Number | 13 | 0.8% |
Open Punctuation | 12 | 0.8% |
Close Punctuation | 12 | 0.8% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
대 | 56 | 3.6% |
구 | 53 | 3.4% |
신 | 50 | 3.2% |
동 | 49 | 3.2% |
지 | 37 | 2.4% |
청 | 35 | 2.3% |
원 | 35 | 2.3% |
사 | 32 | 2.1% |
산 | 27 | 1.7% |
문 | 26 | 1.7% |
Other values (187) | 1152 | 74.2% |
Decimal Number
Value | Count | Frequency (%) |
3 | 11 | 84.6% |
4 | 2 | 15.4% |
Open Punctuation
Value | Count | Frequency (%) |
( | 12 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 12 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 1552 | 97.7% |
Common | 37 | 2.3% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
대 | 56 | 3.6% |
구 | 53 | 3.4% |
신 | 50 | 3.2% |
동 | 49 | 3.2% |
지 | 37 | 2.4% |
청 | 35 | 2.3% |
원 | 35 | 2.3% |
사 | 32 | 2.1% |
산 | 27 | 1.7% |
문 | 26 | 1.7% |
Other values (187) | 1152 | 74.2% |
Common
Value | Count | Frequency (%) |
( | 12 | 32.4% |
) | 12 | 32.4% |
3 | 11 | 29.7% |
4 | 2 | 5.4% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 1552 | 97.7% |
ASCII | 37 | 2.3% |
Most frequent character per block
Hangul
Value | Count | Frequency (%) |
대 | 56 | 3.6% |
구 | 53 | 3.4% |
신 | 50 | 3.2% |
동 | 49 | 3.2% |
지 | 37 | 2.4% |
청 | 35 | 2.3% |
원 | 35 | 2.3% |
사 | 32 | 2.1% |
산 | 27 | 1.7% |
문 | 26 | 1.7% |
Other values (187) | 1152 | 74.2% |
ASCII
Value | Count | Frequency (%) |
( | 12 | 32.4% |
) | 12 | 32.4% |
3 | 11 | 29.7% |
4 | 2 | 5.4% |
승차시간구분 (boarding time slot)
Categorical
Distinct | 22 |
---|---|
Distinct (%) | 4.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.0 KiB |
05:00:00~05:59:59 | 35 |
---|---|
00:00:00~00:59:59 | 29 |
14:00:00~14:59:59 | 29 |
23:00:00~23:59:59 | 28 |
08:00:00~08:59:59 | 27 |
Other values (17) | 352 |
Length
Max length | 17 |
---|---|
Median length | 17 |
Mean length | 17 |
Min length | 17 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 21:00:00~21:59:59 |
---|---|
2nd row | 22:00:00~22:59:59 |
3rd row | 17:00:00~17:59:59 |
4th row | 12:00:00~12:59:59 |
5th row | 09:00:00~09:59:59 |
Common Values
Value | Count | Frequency (%) |
05:00:00~05:59:59 | 35 | 7.0% |
00:00:00~00:59:59 | 29 | 5.8% |
14:00:00~14:59:59 | 29 | 5.8% |
23:00:00~23:59:59 | 28 | 5.6% |
08:00:00~08:59:59 | 27 | 5.4% |
21:00:00~21:59:59 | 27 | 5.4% |
22:00:00~22:59:59 | 26 | 5.2% |
19:00:00~19:59:59 | 26 | 5.2% |
16:00:00~16:59:59 | 25 | 5.0% |
17:00:00~17:59:59 | 25 | 5.0% |
Other values (12) | 223 | 44.6% |
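Every value of 승차시간구분 is a 17-character string of the form `HH:MM:SS~HH:MM:SS`, so it can be split into a start and end time before further analysis. The helper below is a minimal sketch; `parse_window` is a hypothetical name, not part of the dataset or of ydata-profiling.

```python
from datetime import time


def parse_window(window: str) -> tuple[time, time]:
    """Split an 'HH:MM:SS~HH:MM:SS' string into (start, end) times."""
    start_text, end_text = window.split("~")
    return time.fromisoformat(start_text), time.fromisoformat(end_text)


# Example with the first sample value listed above.
start, end = parse_window("21:00:00~21:59:59")
print(start.hour, end)
```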
30분시간구간ID (30-minute interval ID)
Categorical
Distinct | 2 |
---|---|
Distinct (%) | 0.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.0 KiB |
0 | 263 |
---|---|
30 | 237 |
Length
Max length | 2 |
---|---|
Median length | 1 |
Mean length | 1.474 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 30 |
---|---|
2nd row | 0 |
3rd row | 30 |
4th row | 0 |
5th row | 30 |
Common Values
Value | Count | Frequency (%) |
0 | 263 | 52.6% |
30 | 237 | 47.4% |
승차총승객수 (total boarding passengers)
Real number (ℝ)
HIGH CORRELATION
ZEROS
Distinct | 366 |
---|---|
Distinct (%) | 73.2% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 415.954 |
Minimum | 0 |
---|---|
Maximum | 5463 |
Zeros | 8 |
Zeros (%) | 1.6% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 4.5 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 10.95 |
Q1 | 84 |
median | 241.5 |
Q3 | 492.5 |
95-th percentile | 1480.15 |
Maximum | 5463 |
Range | 5463 |
Interquartile range (IQR) | 408.5 |
Descriptive statistics
Standard deviation | 574.5776 |
---|---|
Coefficient of variation (CV) | 1.3813489 |
Kurtosis | 21.003083 |
Mean | 415.954 |
Median Absolute Deviation (MAD) | 175.5 |
Skewness | 3.7541424 |
Sum | 207977 |
Variance | 330139.41 |
Monotonicity | Not monotonic |
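Several of the figures above are derived from others in the same tables: the range from the minimum and maximum, the IQR from the quartiles, the coefficient of variation from the standard deviation and mean, and so on. The sketch below recomputes them from the reported values. Small differences in the last digit are expected because the inputs are already rounded.

```python
# Summary values copied from the 승차총승객수 tables above.
n = 500
minimum, maximum = 0, 5463
q1, q3 = 84, 492.5
mean, std = 415.954, 574.5776
zeros = 8

value_range = maximum - minimum  # report lists Range as 5463
iqr = q3 - q1                    # report lists IQR as 408.5
cv = std / mean                  # report lists CV as 1.3813489
variance = std ** 2              # report lists Variance as 330139.41
zeros_pct = 100 * zeros / n      # report lists Zeros (%) as 1.6%
total = mean * n                 # report lists Sum as 207977

print(value_range, iqr, round(cv, 4), round(variance, 1), zeros_pct, round(total))
```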
Common Values
Value | Count | Frequency (%) |
0 | 8 | 1.6% |
1 | 7 | 1.4% |
66 | 7 | 1.4% |
23 | 6 | 1.2% |
240 | 5 | 1.0% |
226 | 4 | 0.8% |
203 | 4 | 0.8% |
41 | 4 | 0.8% |
277 | 3 | 0.6% |
181 | 3 | 0.6% |
Other values (356) | 449 | 89.8% |
Minimum 10 values
Value | Count | Frequency (%) |
0 | 8 | 1.6% |
1 | 7 | 1.4% |
2 | 1 | 0.2% |
3 | 2 | 0.4% |
4 | 1 | 0.2% |
5 | 1 | 0.2% |
7 | 2 | 0.4% |
8 | 1 | 0.2% |
10 | 2 | 0.4% |
11 | 1 | 0.2% |
Maximum 10 values
Value | Count | Frequency (%) |
5463 | 1 | 0.2% |
4309 | 1 | 0.2% |
3889 | 1 | 0.2% |
3350 | 1 | 0.2% |
2999 | 1 | 0.2% |
2771 | 1 | 0.2% |
2641 | 1 | 0.2% |
2481 | 1 | 0.2% |
2197 | 1 | 0.2% |
2138 | 1 | 0.2% |
하차총승객수 (total alighting passengers)
Real number (ℝ)
HIGH CORRELATION
ZEROS
Distinct | 373 |
---|---|
Distinct (%) | 74.6% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 399.718 |
Minimum | 0 |
---|---|
Maximum | 5554 |
Zeros | 15 |
Zeros (%) | 3.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 4.5 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 7.95 |
Q1 | 92.75 |
median | 233 |
Q3 | 499.5 |
95-th percentile | 1335.1 |
Maximum | 5554 |
Range | 5554 |
Interquartile range (IQR) | 406.75 |
Descriptive statistics
Standard deviation | 512.14107 |
---|---|
Coefficient of variation (CV) | 1.281256 |
Kurtosis | 25.033933 |
Mean | 399.718 |
Median Absolute Deviation (MAD) | 171.5 |
Skewness | 3.7949991 |
Sum | 199859 |
Variance | 262288.47 |
Monotonicity | Not monotonic |
Common Values
Value | Count | Frequency (%) |
0 | 15 | 3.0% |
31 | 5 | 1.0% |
197 | 4 | 0.8% |
79 | 4 | 0.8% |
295 | 4 | 0.8% |
69 | 4 | 0.8% |
25 | 4 | 0.8% |
495 | 3 | 0.6% |
14 | 3 | 0.6% |
59 | 3 | 0.6% |
Other values (363) | 451 | 90.2% |
Minimum 10 values
Value | Count | Frequency (%) |
0 | 15 | 3.0% |
1 | 3 | 0.6% |
3 | 3 | 0.6% |
4 | 2 | 0.4% |
6 | 1 | 0.2% |
7 | 1 | 0.2% |
8 | 2 | 0.4% |
9 | 1 | 0.2% |
10 | 1 | 0.2% |
12 | 2 | 0.4% |
Maximum 10 values
Value | Count | Frequency (%) |
5554 | 1 | 0.2% |
3390 | 1 | 0.2% |
3009 | 1 | 0.2% |
2659 | 1 | 0.2% |
2318 | 1 | 0.2% |
2087 | 1 | 0.2% |
2069 | 1 | 0.2% |
2059 | 1 | 0.2% |
1961 | 1 | 0.2% |
1935 | 1 | 0.2% |
Variable | 호선ID | 호선 | 역ID | 승차시간구분 | 30분시간구간ID | 승차총승객수 | 하차총승객수 |
---|---|---|---|---|---|---|---|
호선ID | 1.000 | 1.000 | 1.000 | 0.113 | 0.000 | 0.383 | 0.293 |
호선 | 1.000 | 1.000 | 0.971 | 0.128 | 0.000 | 0.260 | 0.270 |
역ID | 1.000 | 0.971 | 1.000 | 0.100 | 0.000 | 0.197 | 0.313 |
승차시간구분 | 0.113 | 0.128 | 0.100 | 1.000 | 0.202 | 0.281 | 0.199 |
30분시간구간ID | 0.000 | 0.000 | 0.000 | 0.202 | 1.000 | 0.000 | 0.044 |
승차총승객수 | 0.383 | 0.260 | 0.197 | 0.281 | 0.000 | 1.000 | 0.732 |
하차총승객수 | 0.293 | 0.270 | 0.313 | 0.199 | 0.044 | 0.732 | 1.000 |
Variable | 호선 | 승차시간구분 | 30분시간구간ID |
---|---|---|---|
호선 | 1.000 | 0.046 | 0.000 |
승차시간구분 | 0.046 | 1.000 | 0.156 |
30분시간구간ID | 0.000 | 0.156 | 1.000 |
Variable | 호선ID | 역ID | 승차총승객수 | 하차총승객수 | 호선 | 승차시간구분 | 30분시간구간ID |
---|---|---|---|---|---|---|---|
호선ID | 1.000 | 0.991 | -0.142 | -0.125 | 0.993 | 0.056 | 0.000 |
역ID | 0.991 | 1.000 | -0.143 | -0.127 | 0.909 | 0.053 | 0.000 |
승차총승객수 | -0.142 | -0.143 | 1.000 | 0.751 | 0.120 | 0.111 | 0.000 |
하차총승객수 | -0.125 | -0.127 | 0.751 | 1.000 | 0.131 | 0.080 | 0.033 |
호선 | 0.993 | 0.909 | 0.120 | 0.131 | 1.000 | 0.046 | 0.000 |
승차시간구분 | 0.056 | 0.053 | 0.111 | 0.080 | 0.046 | 1.000 | 0.156 |
30분시간구간ID | 0.000 | 0.000 | 0.000 | 0.033 | 0.000 | 0.156 | 1.000 |
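The high-correlation alerts near the top of the report can be cross-checked against the last correlation table above by listing every pair whose absolute coefficient exceeds a cut-off. The sketch below does that. The threshold of 0.5 is an assumption chosen for illustration and is not taken from the report's configuration; `THRESHOLD` and `high_pairs` are hypothetical names.

```python
from itertools import combinations

# Coefficients copied from the last correlation table above.
names = ["호선ID", "역ID", "승차총승객수", "하차총승객수",
         "호선", "승차시간구분", "30분시간구간ID"]
matrix = [
    [1.000, 0.991, -0.142, -0.125, 0.993, 0.056, 0.000],
    [0.991, 1.000, -0.143, -0.127, 0.909, 0.053, 0.000],
    [-0.142, -0.143, 1.000, 0.751, 0.120, 0.111, 0.000],
    [-0.125, -0.127, 0.751, 1.000, 0.131, 0.080, 0.033],
    [0.993, 0.909, 0.120, 0.131, 1.000, 0.046, 0.000],
    [0.056, 0.053, 0.111, 0.080, 0.046, 1.000, 0.156],
    [0.000, 0.000, 0.000, 0.033, 0.000, 0.156, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not read from the report's config.json

# Visit each unordered pair once and keep those at or above the cut-off.
high_pairs = [
    (names[i], names[j], matrix[i][j])
    for i, j in combinations(range(len(names)), 2)
    if abs(matrix[i][j]) >= THRESHOLD
]

for left, right, value in high_pairs:
    print(f"{left} ~ {right}: {value:.3f}")
```

Under this assumed threshold the selected pairs are the same ones named in the alerts: 호선ID with 역ID and 호선, 역ID with 호선, and 승차총승객수 with 하차총승객수.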
First rows
Index | 운행일자 | 호선ID | 호선 | 역ID | 역 | 승차시간구분 | 30분시간구간ID | 승차총승객수 | 하차총승객수 |
---|---|---|---|---|---|---|---|---|---|
0 | 2014-05-21 | 206 | 6호선 | 2626 | 대흥 | 21:00:00~21:59:59 | 30 | 110 | 143 |
1 | 2015-01-15 | 208 | 8호선 | 2826 | 수진 | 22:00:00~22:59:59 | 0 | 69 | 165 |
2 | 2014-07-18 | 206 | 6호선 | 2647 | 화랑대 | 17:00:00~17:59:59 | 30 | 544 | 414 |
3 | 2015-01-19 | 206 | 6호선 | 2633 | 버티고개 | 12:00:00~12:59:59 | 0 | 59 | 46 |
4 | 2014-06-18 | 2 | 2호선 | 247 | 도림천 | 09:00:00~09:59:59 | 30 | 35 | 67 |
5 | 2015-05-08 | 206 | 6호선 | 2617 | 새절 | 11:00:00~11:59:59 | 30 | 260 | 122 |
6 | 2014-12-08 | 205 | 5호선 | 2534 | 광화문 | 22:00:00~22:59:59 | 30 | 977 | 175 |
7 | 2014-02-16 | 208 | 8호선 | 2814 | 몽촌토성 | 16:00:00~16:59:59 | 0 | 240 | 218 |
8 | 2014-02-03 | 2 | 2호선 | 213 | 구의 | 05:00:00~05:59:59 | 0 | 116 | 0 |
9 | 2014-08-01 | 2 | 2호선 | 237 | 당산 | 21:00:00~21:59:59 | 0 | 320 | 475 |
Last rows
Index | 운행일자 | 호선ID | 호선 | 역ID | 역 | 승차시간구분 | 30분시간구간ID | 승차총승객수 | 하차총승객수 |
---|---|---|---|---|---|---|---|---|---|
490 | 2015-04-19 | 207 | 7호선 | 2749 | 철산 | 20:00:00~20:59:59 | 0 | 397 | 1252 |
491 | 2015-06-14 | 1 | 1호선 | 151 | 시청 | 19:00:00~19:59:59 | 0 | 1957 | 491 |
492 | 2014-12-31 | 3 | 3호선 | 329 | 고속터미널 | 22:00:00~22:59:59 | 0 | 1144 | 633 |
493 | 2015-10-27 | 205 | 5호선 | 2548 | 천호 | 15:00:00~15:59:59 | 30 | 648 | 687 |
494 | 2014-05-25 | 206 | 6호선 | 2616 | 구산 | 13:00:00~13:59:59 | 30 | 194 | 114 |
495 | 2015-07-19 | 207 | 7호선 | 2735 | 반포 | 22:00:00~22:59:59 | 0 | 190 | 69 |
496 | 2015-06-07 | 205 | 5호선 | 2559 | 개롱 | 17:00:00~17:59:59 | 0 | 115 | 214 |
497 | 2015-01-24 | 404 | 9호선2단계 | 4126 | 언주 | 06:00:00~06:59:59 | 0 | 25 | 79 |
498 | 2014-07-28 | 207 | 7호선 | 2727 | 군자 | 17:00:00~17:59:59 | 30 | 316 | 321 |
499 | 2014-07-24 | 206 | 6호선 | 2640 | 안암 | 17:00:00~17:59:59 | 0 | 954 | 392 |