Dataset statistics
Number of variables | 4 |
---|---|
Number of observations | 140 |
Missing cells | 60 |
Missing cells (%) | 10.7% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 4.9 KiB |
Average record size in memory | 35.9 B |
Variable types
Text | 1 |
---|---|
Numeric | 3 |
Dataset
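Description | Boarding, alighting, and related performance figures by station for Saemaeul (새마을) down-line (하행) trains. |
---|---|
Author | Korea Railroad Corporation (한국철도공사) |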
URL | https://www.data.go.kr/data/15068474/fileData.do |
승차 is highly overall correlated with 하차 and 1 other fields | High correlation |
하차 is highly overall correlated with 승차 and 1 other fields | High correlation |
인키로 is highly overall correlated with 승차 and 1 other fields | High correlation |
승차 has 20 (14.3%) missing values | Missing |
하차 has 20 (14.3%) missing values | Missing |
인키로 has 20 (14.3%) missing values | Missing |
역명 has unique values | Unique |
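The missing-value figures in the dataset statistics and in the alerts can be cross-checked with a little arithmetic. The sketch below is only a sanity check on the reported numbers; the variable names are illustrative and do not come from the report.

```python
# Figures copied from the report.
n_rows = 140
n_columns = 4
missing_per_column = {"역명": 0, "승차": 20, "하차": 20, "인키로": 20}

total_cells = n_rows * n_columns
missing_cells = sum(missing_per_column.values())

# Share of missing cells across the whole table.
missing_cells_pct = round(100 * missing_cells / total_cells, 1)

# Share of missing values within each column.
missing_pct_by_column = {
    name: round(100 * count / n_rows, 1)
    for name, count in missing_per_column.items()
}

print(missing_cells, missing_cells_pct)  # 60 10.7
print(missing_pct_by_column["승차"])      # 14.3
```

The 60 missing cells are exactly the three numeric columns with 20 missing values each.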
Reproduction
Analysis started | 2023-12-12 02:11:03.828484 |
---|---|
Analysis finished | 2023-12-12 02:11:05.860210 |
Duration | 2.03 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
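A report of this kind can be regenerated with the `ProfileReport` class from ydata-profiling. The following is a minimal sketch under assumptions: the CSV path, the `cp949` encoding, the output file name, and the report title are all hypothetical (the report only gives the data.go.kr URL), and the settings stored in the original `config.json` are not shown here, so the output may differ in detail.

```python
def build_report(csv_path: str,
                 html_path: str = "report.html",
                 encoding: str = "cp949") -> None:
    """Profile a CSV file and write the result as an HTML report.

    csv_path, html_path, and the default encoding are assumptions made
    for illustration; adjust them to the file actually downloaded.
    """
    # Imported lazily so the function can be defined even when the
    # third-party packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    # The title is a placeholder, not the title of the original report.
    profile = ProfileReport(df, title="Saemaeul down-line station statistics")
    profile.to_file(html_path)
```

Calling `build_report("saemaeul_down.csv")`, where the file name is again a placeholder, would write `report.html` to the working directory.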
역명 (station name)
Text
Alerts: Unique
Distinct | 140 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.2 KiB |
Value | Count | Frequency (%) |
가평 | 1 | 0.7% |
오근장 | 1 | 0.7% |
전곡 | 1 | 0.7% |
장항 | 1 | 0.7% |
장성 | 1 | 0.7% |
임진강 | 1 | 0.7% |
익산 | 1 | 0.7% |
의정부 | 1 | 0.7% |
의성 | 1 | 0.7% |
예천 | 1 | 0.7% |
Other values (130) | 130 | 92.9% |
Most occurring characters
Value | Count | Frequency (%) |
천 | 21 | 6.6% |
산 | 14 | 4.4% |
주 | 12 | 3.8% |
동 | 8 | 2.5% |
원 | 8 | 2.5% |
평 | 7 | 2.2% |
성 | 7 | 2.2% |
양 | 7 | 2.2% |
영 | 6 | 1.9% |
진 | 6 | 1.9% |
Other values (124) | 223 | 69.9% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 319 | 100.0% |
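The character and category counts above can be illustrated with the standard library. This is a sketch only, not ydata-profiling's own implementation, and it uses just the ten station names listed in the value table rather than all 140.

```python
from collections import Counter
import unicodedata

# The ten station names shown in the value table.
station_names = ["가평", "오근장", "전곡", "장항", "장성",
                 "임진강", "익산", "의정부", "의성", "예천"]

# Count every character across all names.
char_counts = Counter(ch for name in station_names for ch in name)

# Count characters by Unicode general category ("Lo" is Other Letter).
category_counts = Counter(
    unicodedata.category(ch) for name in station_names for ch in name
)

print(char_counts.most_common(1))  # [('장', 3)]
print(dict(category_counts))       # {'Lo': 23}
```

Every Hangul syllable falls in the general category `Lo`, which is why the report shows a single category covering all 319 characters.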
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 319 | 100.0% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 319 | 100.0% |
승차 (boardings)
Real number (ℝ)
Alerts: High correlation, Missing
Distinct | 116 |
---|---|
Distinct (%) | 96.7% |
Missing | 20 |
Missing (%) | 14.3% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 10981.425 |
Minimum | 2 |
---|---|
Maximum | 264515 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.4 KiB |
Quantile statistics
Minimum | 2 |
---|---|
5-th percentile | 9.95 |
Q1 | 158.75 |
median | 557 |
Q3 | 3269.25 |
95-th percentile | 46265.05 |
Maximum | 264515 |
Range | 264513 |
Interquartile range (IQR) | 3110.5 |
Descriptive statistics
Standard deviation | 40105.48 |
---|---|
Coefficient of variation (CV) | 3.6521198 |
Kurtosis | 28.143808 |
Mean | 10981.425 |
Median Absolute Deviation (MAD) | 528.5 |
Skewness | 5.2323423 |
Sum | 1317771 |
Variance | 1.6084495 × 10^9 |
Monotonicity | Not monotonic |
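Several of the figures for 승차 are derived from one another, so they can be checked for internal consistency. The sketch below assumes the usual definitions (mean as sum over non-missing count, CV as standard deviation over mean, variance as squared standard deviation) and reads the variance as 1.6084495 times ten to the ninth power; the dictionary keys are illustrative names.

```python
import math

# Reported figures for 승차, copied from the tables above.
n_present = 140 - 20            # observations minus missing values
total = 1317771
mean = 10981.425
std = 40105.48
q1, q3 = 158.75, 3269.25
minimum, maximum = 2, 264515

derived = {
    "mean": total / n_present,  # sum divided by non-missing count
    "cv": std / mean,           # coefficient of variation
    "variance": std ** 2,       # squared standard deviation
    "iqr": q3 - q1,             # interquartile range
    "range": maximum - minimum,
}

reported = {
    "mean": 10981.425,
    "cv": 3.6521198,
    "variance": 1.6084495e9,
    "iqr": 3110.5,
    "range": 264513,
}

for name, value in derived.items():
    # Allow for rounding in the printed report.
    assert math.isclose(value, reported[name], rel_tol=1e-6), name
```

The same check applies to 하차 and 인키로 by substituting their reported values.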
Value | Count | Frequency (%) |
6 | 2 | 1.4% |
9 | 2 | 1.4% |
269 | 2 | 1.4% |
2 | 2 | 1.4% |
4717 | 1 | 0.7% |
5993 | 1 | 0.7% |
34 | 1 | 0.7% |
380 | 1 | 0.7% |
6161 | 1 | 0.7% |
861 | 1 | 0.7% |
Other values (106) | 106 | 75.7% |
(Missing) | 20 | 14.3% |
Minimum 10 values
Value | Count | Frequency (%) |
2 | 2 | 1.4% |
6 | 2 | 1.4% |
9 | 2 | 1.4% |
10 | 1 | 0.7% |
11 | 1 | 0.7% |
15 | 1 | 0.7% |
18 | 1 | 0.7% |
19 | 1 | 0.7% |
24 | 1 | 0.7% |
27 | 1 | 0.7% |
Maximum 10 values
Value | Count | Frequency (%) |
264515 | 1 | 0.7% |
247151 | 1 | 0.7% |
215176 | 1 | 0.7% |
120300 | 1 | 0.7% |
66330 | 1 | 0.7% |
46817 | 1 | 0.7% |
46236 | 1 | 0.7% |
42839 | 1 | 0.7% |
35089 | 1 | 0.7% |
27089 | 1 | 0.7% |
하차 (alightings)
Real number (ℝ)
Alerts: High correlation, Missing
Distinct | 118 |
---|---|
Distinct (%) | 98.3% |
Missing | 20 |
Missing (%) | 14.3% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 10981.425 |
Minimum | 2 |
---|---|
Maximum | 162739 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.4 KiB |
Quantile statistics
Minimum | 2 |
---|---|
5-th percentile | 29.2 |
Q1 | 333.75 |
median | 938 |
Q3 | 3295.5 |
95-th percentile | 66185.75 |
Maximum | 162739 |
Range | 162737 |
Interquartile range (IQR) | 2961.75 |
Descriptive statistics
Standard deviation | 27744.848 |
---|---|
Coefficient of variation (CV) | 2.5265253 |
Kurtosis | 14.276541 |
Mean | 10981.425 |
Median Absolute Deviation (MAD) | 765 |
Skewness | 3.6071349 |
Sum | 1317771 |
Variance | 7.6977661 × 10^8 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
335 | 2 | 1.4% |
686 | 2 | 1.4% |
2054 | 1 | 0.7% |
16030 | 1 | 0.7% |
192 | 1 | 0.7% |
22210 | 1 | 0.7% |
197 | 1 | 0.7% |
1130 | 1 | 0.7% |
79785 | 1 | 0.7% |
437 | 1 | 0.7% |
Other values (108) | 108 | 77.1% |
(Missing) | 20 | 14.3% |
Minimum 10 values
Value | Count | Frequency (%) |
2 | 1 | 0.7% |
5 | 1 | 0.7% |
9 | 1 | 0.7% |
12 | 1 | 0.7% |
13 | 1 | 0.7% |
14 | 1 | 0.7% |
30 | 1 | 0.7% |
36 | 1 | 0.7% |
39 | 1 | 0.7% |
43 | 1 | 0.7% |
Maximum 10 values
Value | Count | Frequency (%) |
162739 | 1 | 0.7% |
156560 | 1 | 0.7% |
96562 | 1 | 0.7% |
95639 | 1 | 0.7% |
82788 | 1 | 0.7% |
79785 | 1 | 0.7% |
65470 | 1 | 0.7% |
64966 | 1 | 0.7% |
58917 | 1 | 0.7% |
55406 | 1 | 0.7% |
인키로 (passenger-kilometers)
Real number (ℝ)
Alerts: High correlation, Missing
Distinct | 120 |
---|---|
Distinct (%) | 100.0% |
Missing | 20 |
Missing (%) | 14.3% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1512584.8 |
Minimum | 161 |
---|---|
Maximum | 32592545 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.4 KiB |
Quantile statistics
Minimum | 161 |
---|---|
5-th percentile | 3019.8 |
Q1 | 64831.5 |
median | 195457 |
Q3 | 576723.75 |
95-th percentile | 7610851 |
Maximum | 32592545 |
Range | 32592384 |
Interquartile range (IQR) | 511892.25 |
Descriptive statistics
Standard deviation | 3939030.5 |
---|---|
Coefficient of variation (CV) | 2.6041717 |
Kurtosis | 34.523384 |
Mean | 1512584.8 |
Median Absolute Deviation (MAD) | 173502 |
Skewness | 5.1933887 |
Sum | 1.8151018 × 10^8 |
Variance | 1.5515961 × 10^13 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
68674 | 1 | 0.7% |
43622 | 1 | 0.7% |
2817275 | 1 | 0.7% |
44830 | 1 | 0.7% |
29791 | 1 | 0.7% |
6226435 | 1 | 0.7% |
181929 | 1 | 0.7% |
79352 | 1 | 0.7% |
182001 | 1 | 0.7% |
2401519 | 1 | 0.7% |
Other values (110) | 110 | 78.6% |
(Missing) | 20 | 14.3% |
Minimum 10 values
Value | Count | Frequency (%) |
161 | 1 | 0.7% |
390 | 1 | 0.7% |
400 | 1 | 0.7% |
833 | 1 | 0.7% |
1788 | 1 | 0.7% |
2028 | 1 | 0.7% |
3072 | 1 | 0.7% |
3495 | 1 | 0.7% |
4120 | 1 | 0.7% |
6720 | 1 | 0.7% |
Maximum 10 values
Value | Count | Frequency (%) |
32592545 | 1 | 0.7% |
16964026 | 1 | 0.7% |
13793434 | 1 | 0.7% |
8925361 | 1 | 0.7% |
8500276 | 1 | 0.7% |
8034133 | 1 | 0.7% |
7588573 | 1 | 0.7% |
7423940 | 1 | 0.7% |
6226435 | 1 | 0.7% |
5479949 | 1 | 0.7% |
Variable | 승차 | 하차 | 인키로 |
---|---|---|---|
승차 | 1.000 | 0.682 | 0.758 |
하차 | 0.682 | 1.000 | 0.841 |
인키로 | 0.758 | 0.841 | 1.000 |
Variable | 승차 | 하차 | 인키로 |
---|---|---|---|
승차 | 1.000 | 0.583 | 0.559 |
하차 | 0.583 | 1.000 | 0.894 |
인키로 | 0.559 | 0.894 | 1.000 |
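The report does not say which coefficient each of the two matrices uses. As an illustration of how pairwise correlations of this kind can be computed, the sketch below implements Pearson and Spearman (rank) correlation with the standard library, two common choices that are assumed here rather than confirmed by the report. It runs on the 13 sample rows of the report that have no missing values, so it will not reproduce the reported coefficients; the function and variable names are illustrative.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Complete sample rows from the report: (승차, 하차, 인키로).
rows = [
    (10, 346, 89447), (986, 4161, 488286), (182, 992, 365383),
    (178, 1428, 330992), (5379, 44913, 4557922), (286, 1268, 318511),
    (524, 721, 143908), (560, 1866, 450846), (152, 1652, 373561),
    (470, 1928, 250126), (9, 337, 60645), (27089, 156560, 13793434),
    (341, 341, 76214),
]
boardings, alightings, passenger_km = zip(*rows)

print(round(pearson(alightings, passenger_km), 3))
print(round(spearman(alightings, passenger_km), 3))
```

Because the three variables are heavily right-skewed, a rank-based coefficient and a linear one can differ noticeably on the same data.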
First rows
Index | 역명 | 승차 | 하차 | 인키로 |
---|---|---|---|---|
0 | 가평 | 315 | <NA> | <NA> |
1 | 간석 | 382 | <NA> | <NA> |
2 | 경산 | <NA> | 2 | 400 |
3 | 경주 | 10 | 346 | 89447 |
4 | 곡성 | 986 | 4161 | 488286 |
5 | 광양 | 182 | 992 | 365383 |
6 | 광주송정 | 178 | 1428 | 330992 |
7 | 광천 | 5379 | 44913 | 4557922 |
8 | 구례구 | 286 | 1268 | 318511 |
9 | 구미 | 524 | 721 | 143908 |
Last rows
Index | 역명 | 승차 | 하차 | 인키로 |
---|---|---|---|---|
130 | 포항 | 560 | 1866 | 450846 |
131 | 풍기 | 152 | 1652 | 373561 |
132 | 하동 | 470 | 1928 | 250126 |
133 | 하양 | <NA> | 30 | 6720 |
134 | 함안 | <NA> | 330 | 80441 |
135 | 함평 | 9 | 337 | 60645 |
136 | 호계 | 29 | <NA> | <NA> |
137 | 홍성 | 27089 | 156560 | 13793434 |
138 | 화명 | 49 | <NA> | <NA> |
139 | 화본 | 341 | 341 | 76214 |
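In the sample rows, 하차 and 인키로 are missing together, while 승차 is missing in different rows. The sketch below shows one way to tally this from a subset of the sample rows (index column omitted); the parsing helper is illustrative, and the pattern is only confirmed for the rows shown, not for the full dataset.

```python
# A subset of the sample rows: 역명 | 승차 | 하차 | 인키로.
sample = """\
가평 | 315 | <NA> | <NA>
간석 | 382 | <NA> | <NA>
경산 | <NA> | 2 | 400
경주 | 10 | 346 | 89447
하양 | <NA> | 30 | 6720
함안 | <NA> | 330 | 80441
호계 | 29 | <NA> | <NA>
화명 | 49 | <NA> | <NA>
화본 | 341 | 341 | 76214
"""

columns = ["역명", "승차", "하차", "인키로"]

def parse_cell(text):
    """Turn '<NA>' into None, digit strings into int, keep other text."""
    text = text.strip()
    if text == "<NA>":
        return None
    return int(text) if text.isdigit() else text

records = [
    dict(zip(columns, (parse_cell(cell) for cell in line.split("|"))))
    for line in sample.splitlines()
]

# Missing values per column in this subset.
missing_counts = {
    col: sum(rec[col] is None for rec in records) for col in columns
}

# True when 하차 and 인키로 are always missing (or present) together.
missing_together = all(
    (rec["하차"] is None) == (rec["인키로"] is None) for rec in records
)

print(missing_counts)    # {'역명': 0, '승차': 3, '하차': 4, '인키로': 4}
print(missing_together)  # True
```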