Dataset statistics
Number of variables | 4 |
---|---|
Number of observations | 59 |
Missing cells | 13 |
Missing cells (%) | 5.5% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 2.1 KiB |
Average record size in memory | 37.2 B |
Variable types
Text | 1 |
---|---|
Numeric | 3 |
Dataset
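Description | Passenger boarding and alighting counts by station for down-line (하행) ITX-Saemaeul services. |
---|---|
Author | 한국철도공사 (Korea Railroad Corporation, KORAIL) |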
URL | https://www.data.go.kr/data/15068477/fileData.do |
승차 is highly correlated overall with 하차 and 1 other field | High correlation |
하차 is highly correlated overall with 승차 and 1 other field | High correlation |
인키로 is highly correlated overall with 승차 and 1 other field | High correlation |
승차 has 7 (11.9%) missing values | Missing |
하차 has 3 (5.1%) missing values | Missing |
인키로 has 3 (5.1%) missing values | Missing |
역명 has unique values | Unique |
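The missing-value percentages in the overview and in the alerts follow directly from the counts. The sketch below recomputes them from the figures reported above (승차 = boarding, 하차 = alighting, 인키로 = passenger-kilometres, 역명 = station name); the helper name `pct` is illustrative and not part of the report.

```python
# Counts copied from the report above; pct() is an illustrative helper.
N_ROWS = 59
N_COLS = 4
MISSING_BY_COLUMN = {"역명": 0, "승차": 7, "하차": 3, "인키로": 3}


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Overall share: missing cells divided by all cells (rows x columns).
missing_cells = sum(MISSING_BY_COLUMN.values())
overall_pct = pct(missing_cells, N_ROWS * N_COLS)

# Per-column share: missing values divided by the number of rows.
per_column_pct = {name: pct(n, N_ROWS) for name, n in MISSING_BY_COLUMN.items()}

print(missing_cells, overall_pct)
print(per_column_pct)
```

Note that the overall figure divides by all 236 cells, while the per-column figures divide by the 59 rows, which is why 승차 alone shows 11.9% against 5.5% overall.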
Reproduction
Analysis started | 2023-12-12 16:42:57.839774 |
---|---|
Analysis finished | 2023-12-12 16:42:59.190046 |
Duration | 1.35 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
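A report of this kind can probably be regenerated along the following lines. This is a minimal sketch, not the exact invocation used here: the CSV path, the output file name, the report title and the `cp949` encoding are all assumptions, and the actual `config.json` settings are not reproduced.

```python
def build_report(csv_path: str, output_html: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (paths are hypothetical)."""
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the downloaded file is CP949-encoded; change this if it is not.
    df = pd.read_csv(csv_path, encoding="cp949")

    # Default settings; the report above may have used a different configuration.
    profile = ProfileReport(df, title="ITX-Saemaeul down-line ridership by station")
    profile.to_file(output_html)
```

Calling `build_report("itx_saemaeul_down.csv")` with a locally saved copy of the dataset (the file name is a placeholder) would write `report.html` to the working directory.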
역명
Text
UNIQUE
Distinct | 59 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 604.0 B |
Value | Count | Frequency (%) |
강경 | 1 | 1.7% |
양평 | 1 | 1.7% |
여천 | 1 | 1.7% |
영동 | 1 | 1.7% |
영등포 | 1 | 1.7% |
영주 | 1 | 1.7% |
왜관 | 1 | 1.7% |
용문 | 1 | 1.7% |
용산 | 1 | 1.7% |
원동 | 1 | 1.7% |
Other values (49) | 49 |
Most occurring characters
Value | Count | Frequency (%) |
주 | 7 | 5.2% |
원 | 7 | 5.2% |
산 | 6 | 4.5% |
구 | 6 | 4.5% |
대 | 5 | 3.7% |
천 | 5 | 3.7% |
포 | 4 | 3.0% |
영 | 4 | 3.0% |
전 | 4 | 3.0% |
평 | 3 | 2.2% |
Other values (62) | 83 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 134 |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
주 | 7 | 5.2% |
원 | 7 | 5.2% |
산 | 6 | 4.5% |
구 | 6 | 4.5% |
대 | 5 | 3.7% |
천 | 5 | 3.7% |
포 | 4 | 3.0% |
영 | 4 | 3.0% |
전 | 4 | 3.0% |
평 | 3 | 2.2% |
Other values (62) | 83 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 134 |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
주 | 7 | 5.2% |
원 | 7 | 5.2% |
산 | 6 | 4.5% |
구 | 6 | 4.5% |
대 | 5 | 3.7% |
천 | 5 | 3.7% |
포 | 4 | 3.0% |
영 | 4 | 3.0% |
전 | 4 | 3.0% |
평 | 3 | 2.2% |
Other values (62) | 83 |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 134 |
Most frequent character per block
Hangul
Value | Count | Frequency (%) |
주 | 7 | 5.2% |
원 | 7 | 5.2% |
산 | 6 | 4.5% |
구 | 6 | 4.5% |
대 | 5 | 3.7% |
천 | 5 | 3.7% |
포 | 4 | 3.0% |
영 | 4 | 3.0% |
전 | 4 | 3.0% |
평 | 3 | 2.2% |
Other values (62) | 83 |
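The character, category and script tables above are all tallies over the individual characters of the station names. A rough sketch of that tallying, using only the ten names listed in the value table (so the counts differ from the full 59-station figures), might look like this. The standard library has no script lookup, so the first word of the Unicode character name is used as a stand-in, which is an assumption of this sketch rather than how the report computes it.

```python
from collections import Counter
import unicodedata

# The ten station names listed in the value table above (a subset of the 59).
names = ["강경", "양평", "여천", "영동", "영등포", "영주", "왜관", "용문", "용산", "원동"]

# Flatten the names into a single list of characters.
chars = [ch for name in names for ch in name]

# Per-character frequency.
char_counts = Counter(chars)

# Unicode general category; "Lo" is what the report labels "Other Letter".
category_counts = Counter(unicodedata.category(ch) for ch in chars)

# Stand-in for the script: first word of the character name, e.g. "HANGUL".
script_counts = Counter(unicodedata.name(ch).split()[0] for ch in chars)

print(char_counts.most_common(3))
print(dict(category_counts))
print(dict(script_counts))
```

Because every character is a Hangul syllable, the category, script and block breakdowns collapse to a single row each, which is why the same character table repeats under each heading.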
승차
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 52 |
---|---|
Distinct (%) | 100.0% |
Missing | 7 |
Missing (%) | 11.9% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 75749 |
Minimum | 52 |
---|---|
Maximum | 652819 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 663.0 B |
Quantile statistics
Minimum | 52 |
---|---|
5-th percentile | 175 |
Q1 | 1659.25 |
median | 7837.5 |
Q3 | 72483.75 |
95-th percentile | 425197.35 |
Maximum | 652819 |
Range | 652767 |
Interquartile range (IQR) | 70824.5 |
Descriptive statistics
Standard deviation | 149920.56 |
---|---|
Coefficient of variation (CV) | 1.9791755 |
Kurtosis | 6.9364556 |
Mean | 75749 |
Median Absolute Deviation (MAD) | 7432 |
Skewness | 2.6703941 |
Sum | 3938948 |
Variance | 2.2476175 × 10^10 |
Monotonicity | Not monotonic |
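Several of the 승차 statistics above are derived from others, so they can be cross-checked against each other. The sketch below recomputes the range, IQR, mean, coefficient of variation and variance from the reported figures; the variable names are illustrative, and the number of valid observations is taken as 59 rows minus 7 missing values.

```python
import math

# Figures copied from the 승차 tables above.
minimum, maximum = 52, 652819
q1, q3 = 1659.25, 72483.75
total = 3938948
n_valid = 59 - 7          # rows minus missing values
std = 149920.56

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
mean = total / n_valid            # Mean = Sum / non-missing count
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance = standard deviation squared

print(value_range, iqr, mean)
print(round(cv, 4))
# Compare with the reported variance, allowing for rounding of the inputs.
print(math.isclose(variance, 2.2476175e10, rel_tol=1e-6))
```

A coefficient of variation near 2, together with a mean (75749) almost ten times the median (7837.5), indicates a strongly right-skewed distribution dominated by a few very busy stations.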
Common values
Value | Count | Frequency (%) |
131 | 1 | 1.7% |
614924 | 1 | 1.7% |
15924 | 1 | 1.7% |
1726 | 1 | 1.7% |
282977 | 1 | 1.7% |
1070 | 1 | 1.7% |
6949 | 1 | 1.7% |
97240 | 1 | 1.7% |
52 | 1 | 1.7% |
3371 | 1 | 1.7% |
Other values (42) | 42 | |
(Missing) | 7 | 11.9% |
Minimum 10 values
Value | Count | Frequency (%) |
52 | 1 | 1.7% |
124 | 1 | 1.7% |
131 | 1 | 1.7% |
211 | 1 | 1.7% |
261 | 1 | 1.7% |
309 | 1 | 1.7% |
502 | 1 | 1.7% |
965 | 1 | 1.7% |
1022 | 1 | 1.7% |
1070 | 1 | 1.7% |
Maximum 10 values
Value | Count | Frequency (%) |
652819 | 1 | 1.7% |
614924 | 1 | 1.7% |
465338 | 1 | 1.7% |
392355 | 1 | 1.7% |
282977 | 1 | 1.7% |
202350 | 1 | 1.7% |
189137 | 1 | 1.7% |
172449 | 1 | 1.7% |
164732 | 1 | 1.7% |
113944 | 1 | 1.7% |
하차
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 56 |
---|---|
Distinct (%) | 100.0% |
Missing | 3 |
Missing (%) | 5.1% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 70338.357 |
Minimum | 35 |
---|---|
Maximum | 371786 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 663.0 B |
Quantile statistics
Minimum | 35 |
---|---|
5-th percentile | 3125.5 |
Q1 | 9973.75 |
median | 29067.5 |
Q3 | 75821.5 |
95-th percentile | 277075.25 |
Maximum | 371786 |
Range | 371751 |
Interquartile range (IQR) | 65847.75 |
Descriptive statistics
Standard deviation | 95944.951 |
---|---|
Coefficient of variation (CV) | 1.3640488 |
Kurtosis | 2.3873251 |
Mean | 70338.357 |
Median Absolute Deviation (MAD) | 20599 |
Skewness | 1.81144 |
Sum | 3938948 |
Variance | 9.2054336 × 10^9 |
Monotonicity | Not monotonic |
Common values
Value | Count | Frequency (%) |
6609 | 1 | 1.7% |
11985 | 1 | 1.7% |
60216 | 1 | 1.7% |
1187 | 1 | 1.7% |
14307 | 1 | 1.7% |
9070 | 1 | 1.7% |
3449 | 1 | 1.7% |
35 | 1 | 1.7% |
24199 | 1 | 1.7% |
103937 | 1 | 1.7% |
Other values (46) | 46 | |
(Missing) | 3 | 5.1% |
Minimum 10 values
Value | Count | Frequency (%) |
35 | 1 | 1.7% |
1187 | 1 | 1.7% |
2155 | 1 | 1.7% |
3449 | 1 | 1.7% |
5165 | 1 | 1.7% |
5925 | 1 | 1.7% |
6117 | 1 | 1.7% |
6609 | 1 | 1.7% |
8189 | 1 | 1.7% |
8748 | 1 | 1.7% |
Maximum 10 values
Value | Count | Frequency (%) |
371786 | 1 | 1.7% |
359589 | 1 | 1.7% |
307355 | 1 | 1.7% |
266982 | 1 | 1.7% |
265317 | 1 | 1.7% |
218381 | 1 | 1.7% |
209910 | 1 | 1.7% |
202642 | 1 | 1.7% |
196250 | 1 | 1.7% |
159109 | 1 | 1.7% |
인키로
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 56 |
---|---|
Distinct (%) | 100.0% |
Missing | 3 |
Missing (%) | 5.1% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 9306861.4 |
Minimum | 25062 |
---|---|
Maximum | 54443135 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 663.0 B |
Quantile statistics
Minimum | 25062 |
---|---|
5-th percentile | 253027 |
Q1 | 1241776.8 |
median | 4854909 |
Q3 | 10036954 |
95-th percentile | 36116554 |
Maximum | 54443135 |
Range | 54418073 |
Interquartile range (IQR) | 8795177.2 |
Descriptive statistics
Standard deviation | 12699324 |
---|---|
Coefficient of variation (CV) | 1.3645119 |
Kurtosis | 4.5485541 |
Mean | 9306861.4 |
Median Absolute Deviation (MAD) | 3704099.5 |
Skewness | 2.1971654 |
Sum | 5.2118424 × 10^8 |
Variance | 1.6127282 × 10^14 |
Monotonicity | Not monotonic |
Common values
Value | Count | Frequency (%) |
157032 | 1 | 1.7% |
3955950 | 1 | 1.7% |
7170204 | 1 | 1.7% |
164503 | 1 | 1.7% |
1687717 | 1 | 1.7% |
806447 | 1 | 1.7% |
282535 | 1 | 1.7% |
25062 | 1 | 1.7% |
2217838 | 1 | 1.7% |
10506793 | 1 | 1.7% |
Other values (46) | 46 | |
(Missing) | 3 | 5.1% |
Minimum 10 values
Value | Count | Frequency (%) |
25062 | 1 | 1.7% |
157032 | 1 | 1.7% |
164503 | 1 | 1.7% |
282535 | 1 | 1.7% |
530876 | 1 | 1.7% |
685418 | 1 | 1.7% |
806447 | 1 | 1.7% |
834550 | 1 | 1.7% |
918991 | 1 | 1.7% |
951555 | 1 | 1.7% |
Maximum 10 values
Value | Count | Frequency (%) |
54443135 | 1 | 1.7% |
50741736 | 1 | 1.7% |
46888154 | 1 | 1.7% |
32526020 | 1 | 1.7% |
30612428 | 1 | 1.7% |
26020783 | 1 | 1.7% |
24169617 | 1 | 1.7% |
23750141 | 1 | 1.7% |
22659402 | 1 | 1.7% |
15224689 | 1 | 1.7% |
Correlations
| | 역명 | 승차 | 하차 | 인키로 |
---|---|---|---|---|
역명 | 1.000 | 1.000 | 1.000 | 1.000 |
승차 | 1.000 | 1.000 | 0.779 | 0.706 |
하차 | 1.000 | 0.779 | 1.000 | 0.856 |
인키로 | 1.000 | 0.706 | 0.856 | 1.000 |
| | 승차 | 하차 | 인키로 |
---|---|---|---|
승차 | 1.000 | 0.772 | 0.707 |
하차 | 0.772 | 1.000 | 0.959 |
인키로 | 0.707 | 0.959 | 1.000 |
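The "High correlation" alerts can be read off a matrix like the one above by comparing each off-diagonal coefficient with a cut-off. The sketch below does this for the three numeric variables. The threshold of 0.5 is an assumption chosen for illustration, since the report does not state the value it used, and the function name is hypothetical.

```python
# Coefficients copied from the 3x3 matrix above (upper triangle only).
corr = {
    ("승차", "하차"): 0.772,
    ("승차", "인키로"): 0.707,
    ("하차", "인키로"): 0.959,
}

THRESHOLD = 0.5  # assumed cut-off, not stated in the report


def flagged_partners(variable: str, threshold: float = THRESHOLD) -> list[str]:
    """List the variables whose |correlation| with `variable` meets the threshold."""
    partners = []
    for (a, b), r in corr.items():
        if variable in (a, b) and abs(r) >= threshold:
            partners.append(b if a == variable else a)
    return partners


flags = {name: flagged_partners(name) for name in ("승차", "하차", "인키로")}
for name, partners in flags.items():
    print(name, "->", partners)
```

Under this assumed threshold every numeric variable is flagged with both of the others, which matches the three alerts of the form "X is highly correlated overall with Y and 1 other field".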
First rows
| | 역명 | 승차 | 하차 | 인키로 |
---|---|---|---|---|
0 | 강경 | 2527 | 18335 | 1569804 |
1 | 경산 | 24001 | 30138 | 4104249 |
2 | 계룡 | 8964 | 36683 | 5476976 |
3 | 곡성 | 2008 | 9249 | 1202920 |
4 | 광주 | <NA> | 75283 | 13942901 |
5 | 광주송정 | 8726 | 31984 | 5468174 |
6 | 구례구 | 1022 | 5165 | 1254729 |
7 | 구미 | 392355 | 209910 | 22659402 |
8 | 구포 | 5160 | 202642 | 32526020 |
9 | 김제 | 14700 | 49359 | 6462663 |
Last rows
| | 역명 | 승차 | 하차 | 인키로 |
---|---|---|---|---|
49 | 진주 | <NA> | 16164 | 4629693 |
50 | 창원 | 1200 | 14958 | 3139901 |
51 | 창원중앙 | 3981 | 32543 | 6297130 |
52 | 천안 | 189137 | 266982 | 26020783 |
53 | 청도 | 5940 | 11909 | 959543 |
54 | 청량리 | 66664 | <NA> | <NA> |
55 | 평택 | 89943 | 159109 | 15224689 |
56 | 풍기 | 211 | 9108 | 965992 |
57 | 함안 | 309 | 6117 | 834550 |
58 | 함평 | 261 | 8189 | 1640924 |
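Because 승차, 하차 and 인키로 each contain `<NA>` entries, any pairwise correlation has to decide which rows to use. A common choice, assumed here rather than confirmed for this report, is pairwise deletion: for each pair of columns, keep only the rows where both values are present. The sketch below applies a hand-written Pearson coefficient to the 20 sample rows above. It covers only 20 of the 59 stations, so its results are not expected to match the matrices in the report.

```python
import math

# (승차, 하차, 인키로) for the 20 sample rows above; None marks <NA>.
rows = [
    (2527, 18335, 1569804), (24001, 30138, 4104249), (8964, 36683, 5476976),
    (2008, 9249, 1202920), (None, 75283, 13942901), (8726, 31984, 5468174),
    (1022, 5165, 1254729), (392355, 209910, 22659402), (5160, 202642, 32526020),
    (14700, 49359, 6462663), (None, 16164, 4629693), (1200, 14958, 3139901),
    (3981, 32543, 6297130), (189137, 266982, 26020783), (5940, 11909, 959543),
    (66664, None, None), (89943, 159109, 15224689), (211, 9108, 965992),
    (309, 6117, 834550), (261, 8189, 1640924),
]


def pearson(xs, ys):
    """Pearson r over the pairs where both values are present; returns (r, n)."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y), n


boarding, alighting, passenger_km = zip(*rows)
r_board_alight, n_board_alight = pearson(boarding, alighting)
r_alight_km, n_alight_km = pearson(alighting, passenger_km)

print(n_board_alight, round(r_board_alight, 3))
print(n_alight_km, round(r_alight_km, 3))
```

The number of usable rows differs by pair: rows missing 승차 still contribute to the 하차 and 인키로 pair, so each coefficient is based on a slightly different subset of stations.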