Dataset statistics
Number of variables | 5 |
---|---|
Number of observations | 276 |
Missing cells | 661 |
Missing cells (%) | 47.9% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 12.0 KiB |
Average record size in memory | 44.5 B |
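The missing-cell totals above follow from the per-variable missing counts listed in this report's alerts. A minimal Python sketch of that arithmetic, with the counts copied by hand from the report (the variable names in the code are illustrative only):

```python
# Per-variable missing counts as reported in this profile.
missing_per_variable = {
    "역": 0,
    "고속열차": 225,
    "새마을": 128,
    "무궁화": 42,
    "통근열차": 266,
}
n_observations = 276

# Every observation contributes one cell per variable.
total_cells = n_observations * len(missing_per_variable)
missing_cells = sum(missing_per_variable.values())
missing_pct = 100 * missing_cells / total_cells

print(f"missing cells: {missing_cells} of {total_cells} ({missing_pct:.1f}%)")
for name, count in missing_per_variable.items():
    print(f"{name}: {100 * count / n_observations:.1f}% missing")
```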
Variable types
Text | 1 |
---|---|
Numeric | 4 |
Dataset
Description | Passenger boarding counts by station (high-speed trains, Saemaeul, etc.). |
---|---|
Author | 한국철도공사 (Korea Railroad Corporation, KORAIL) |
URL | https://www.data.go.kr/data/15068456/fileData.do |
새마을 is highly overall correlated with 무궁화 and 1 other field | High correlation |
무궁화 is highly overall correlated with 새마을 and 1 other field | High correlation |
통근열차 is highly overall correlated with 새마을 and 1 other field | High correlation |
고속열차 has 225 (81.5%) missing values | Missing |
새마을 has 128 (46.4%) missing values | Missing |
무궁화 has 42 (15.2%) missing values | Missing |
통근열차 has 266 (96.4%) missing values | Missing |
역 has unique values | Unique |
Reproduction
Analysis started | 2023-12-12 17:13:29.686550 |
---|---|
Analysis finished | 2023-12-12 17:13:32.674007 |
Duration | 2.99 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
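A report like this one is typically produced with a few lines of Python. The sketch below assumes the source CSV has been downloaded locally; the file name `station_boardings.csv`, the report title, the output path, and the encoding are hypothetical and not taken from the report. It uses `ProfileReport` and `to_file` from ydata-profiling with default settings, which may differ from the configuration stored in `config.json`.

```python
def build_report(csv_path, out_path="report.html", encoding="utf-8"):
    """Profile a CSV file and write the result as an HTML report."""
    # Imported lazily so this module loads even where the
    # third-party packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    # The title is a placeholder chosen for this sketch.
    report = ProfileReport(df, title="Passenger boardings by station")
    report.to_file(out_path)
    return report

# Example call (hypothetical file name):
# build_report("station_boardings.csv")
```

If the downloaded file is not UTF-8 encoded, passing a different codec such as `encoding="cp949"` is one option; the report itself does not state which encoding the source file uses.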
역
Text
UNIQUE
Distinct | 276 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.3 KiB |
Value | Count | Frequency (%) |
서울 | 1 | 0.4% |
득량 | 1 | 0.4% |
진상 | 1 | 0.4% |
광양 | 1 | 0.4% |
벌교 | 1 | 0.4% |
조성 | 1 | 0.4% |
예당 | 1 | 0.4% |
순천 | 1 | 0.4% |
명봉 | 1 | 0.4% |
구례구 | 1 | 0.4% |
Other values (266) | 266 | 96.4% |
Most occurring characters
Value | Count | Frequency (%) |
천 | 31 | 4.9% |
산 | 22 | 3.5% |
동 | 19 | 3.0% |
주 | 18 | 2.8% |
신 | 14 | 2.2% |
양 | 13 | 2.1% |
성 | 13 | 2.1% |
원 | 12 | 1.9% |
정 | 10 | 1.6% |
리 | 10 | 1.6% |
Other values (184) | 470 | 74.4% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 630 | 99.7% |
Decimal Number | 1 | 0.2% |
Uppercase Letter | 1 | 0.2% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
천 | 31 | 4.9% |
산 | 22 | 3.5% |
동 | 19 | 3.0% |
주 | 18 | 2.9% |
신 | 14 | 2.2% |
양 | 13 | 2.1% |
성 | 13 | 2.1% |
원 | 12 | 1.9% |
정 | 10 | 1.6% |
리 | 10 | 1.6% |
Other values (182) | 468 |
Decimal Number
Value | Count | Frequency (%) |
2 | 1 |
Uppercase Letter
Value | Count | Frequency (%) |
T | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 630 | 99.7% |
Common | 1 | 0.2% |
Latin | 1 | 0.2% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
천 | 31 | 4.9% |
산 | 22 | 3.5% |
동 | 19 | 3.0% |
주 | 18 | 2.9% |
신 | 14 | 2.2% |
양 | 13 | 2.1% |
성 | 13 | 2.1% |
원 | 12 | 1.9% |
정 | 10 | 1.6% |
리 | 10 | 1.6% |
Other values (182) | 468 |
Common
Value | Count | Frequency (%) |
2 | 1 |
Latin
Value | Count | Frequency (%) |
T | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 630 | 99.7% |
ASCII | 2 | 0.3% |
Most frequent character per block
Hangul
Value | Count | Frequency (%) |
천 | 31 | 4.9% |
산 | 22 | 3.5% |
동 | 19 | 3.0% |
주 | 18 | 2.9% |
신 | 14 | 2.2% |
양 | 13 | 2.1% |
성 | 13 | 2.1% |
원 | 12 | 1.9% |
정 | 10 | 1.6% |
리 | 10 | 1.6% |
Other values (182) | 468 |
ASCII
Value | Count | Frequency (%) |
2 | 1 | |
T | 1 |
고속열차
Real number (ℝ)
MISSING
Distinct | 51 |
---|---|
Distinct (%) | 100.0% |
Missing | 225 |
Missing (%) | 81.5% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1314533.2 |
Minimum | 14696 |
---|---|
Maximum | 13821606 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.6 KiB |
Quantile statistics
Minimum | 14696 |
---|---|
5-th percentile | 36520 |
Q1 | 124899 |
median | 408956 |
Q3 | 1035383.5 |
95-th percentile | 5375036 |
Maximum | 13821606 |
Range | 13806910 |
Interquartile range (IQR) | 910484.5 |
Descriptive statistics
Standard deviation | 2385772.9 |
---|---|
Coefficient of variation (CV) | 1.8149202 |
Kurtosis | 14.858069 |
Mean | 1314533.2 |
Median Absolute Deviation (MAD) | 323527 |
Skewness | 3.4548526 |
Sum | 67041194 |
Variance | 5.6919121 × 10¹² |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
4923025 | 1 | 0.4% |
1372368 | 1 | 0.4% |
292509 | 1 | 0.4% |
883019 | 1 | 0.4% |
175482 | 1 | 0.4% |
37485 | 1 | 0.4% |
35555 | 1 | 0.4% |
692023 | 1 | 0.4% |
220116 | 1 | 0.4% |
506540 | 1 | 0.4% |
Other values (41) | 41 | 14.9% |
(Missing) | 225 | 81.5% |
Value | Count | Frequency (%) |
14696 | 1 | |
21738 | 1 | |
35555 | 1 | |
37485 | 1 | |
48959 | 1 | |
52921 | 1 | |
60153 | 1 | |
85429 | 1 | |
86535 | 1 | |
96325 | 1 |
Value | Count | Frequency (%) |
13821606 | 1 | |
6054561 | 1 | |
5827047 | 1 | |
4923025 | 1 | |
4888131 | 1 | |
4696528 | 1 | |
3326931 | 1 | |
2839965 | 1 | |
2136326 | 1 | |
2106885 | 1 |
새마을
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 145 |
---|---|
Distinct (%) | 98.0% |
Missing | 128 |
Missing (%) | 46.4% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 72648.081 |
Minimum | 2 |
---|---|
Maximum | 1319446 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.6 KiB |
Quantile statistics
Minimum | 2 |
---|---|
5-th percentile | 48.35 |
Q1 | 619 |
median | 5387.5 |
Q3 | 46115.25 |
95-th percentile | 462855.6 |
Maximum | 1319446 |
Range | 1319444 |
Interquartile range (IQR) | 45496.25 |
Descriptive statistics
Standard deviation | 178721.88 |
---|---|
Coefficient of variation (CV) | 2.4601046 |
Kurtosis | 19.917249 |
Mean | 72648.081 |
Median Absolute Deviation (MAD) | 5330 |
Skewness | 4.0576903 |
Sum | 10751916 |
Variance | 3.194151 × 10¹⁰ |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
335 | 3 | 1.1% |
2 | 2 | 0.7% |
463375 | 1 | 0.4% |
655 | 1 | 0.4% |
2626 | 1 | 0.4% |
193365 | 1 | 0.4% |
627179 | 1 | 0.4% |
24830 | 1 | 0.4% |
461891 | 1 | 0.4% |
1366 | 1 | 0.4% |
Other values (135) | 135 | 48.9% |
(Missing) | 128 | 46.4% |
Value | Count | Frequency (%) |
2 | 2 | |
11 | 1 | |
15 | 1 | |
24 | 1 | |
29 | 1 | |
30 | 1 | |
48 | 1 | |
49 | 1 | |
50 | 1 | |
56 | 1 |
Value | Count | Frequency (%) |
1319446 | 1 | |
881101 | 1 | |
688029 | 1 | |
627179 | 1 | |
603409 | 1 | |
530128 | 1 | |
511593 | 1 | |
463375 | 1 | |
461891 | 1 | |
380759 | 1 |
무궁화
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 233 |
---|---|
Distinct (%) | 99.6% |
Missing | 42 |
Missing (%) | 15.2% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 241888.91 |
Minimum | 188 |
---|---|
Maximum | 4400814 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.6 KiB |
Quantile statistics
Minimum | 188 |
---|---|
5-th percentile | 532.65 |
Q1 | 6525.75 |
median | 36020.5 |
Q3 | 142526.75 |
95-th percentile | 1501243 |
Maximum | 4400814 |
Range | 4400626 |
Interquartile range (IQR) | 136001 |
Descriptive statistics
Standard deviation | 582598.75 |
---|---|
Coefficient of variation (CV) | 2.4085385 |
Kurtosis | 19.199292 |
Mean | 241888.91 |
Median Absolute Deviation (MAD) | 34491 |
Skewness | 4.0694235 |
Sum | 56602006 |
Variance | 3.3942131 × 10¹¹ |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
480 | 2 | 0.7% |
3998 | 1 | 0.4% |
1555 | 1 | 0.4% |
795220 | 1 | 0.4% |
1916700 | 1 | 0.4% |
17292 | 1 | 0.4% |
51521 | 1 | 0.4% |
889205 | 1 | 0.4% |
3831 | 1 | 0.4% |
2339224 | 1 | 0.4% |
Other values (223) | 223 | 80.8% |
(Missing) | 42 | 15.2% |
Value | Count | Frequency (%) |
188 | 1 | |
321 | 1 | |
339 | 1 | |
361 | 1 | |
400 | 1 | |
405 | 1 | |
410 | 1 | |
430 | 1 | |
454 | 1 | |
464 | 1 |
Value | Count | Frequency (%) |
4400814 | 1 | |
3544084 | 1 | |
2873827 | 1 | |
2475664 | 1 | |
2341165 | 1 | |
2339224 | 1 | |
2281132 | 1 | |
2134671 | 1 | |
1916700 | 1 | |
1610274 | 1 |
통근열차
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 10 |
---|---|
Distinct (%) | 100.0% |
Missing | 266 |
Missing (%) | 96.4% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 38969.3 |
Minimum | 323 |
---|---|
Maximum | 173913 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.6 KiB |
Quantile statistics
Minimum | 323 |
---|---|
5-th percentile | 397.7 |
Q1 | 1027.75 |
median | 15183.5 |
Q3 | 60966.25 |
95-th percentile | 132040.5 |
Maximum | 173913 |
Range | 173590 |
Interquartile range (IQR) | 59938.5 |
Descriptive statistics
Standard deviation | 55919.551 |
---|---|
Coefficient of variation (CV) | 1.4349642 |
Kurtosis | 3.3666811 |
Mean | 38969.3 |
Median Absolute Deviation (MAD) | 14777.5 |
Skewness | 1.8328648 |
Sum | 389693 |
Variance | 3.1269962 × 10⁹ |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
173913 | 1 | 0.4% |
8210 | 1 | 0.4% |
489 | 1 | 0.4% |
323 | 1 | 0.4% |
30046 | 1 | 0.4% |
80863 | 1 | 0.4% |
846 | 1 | 0.4% |
1573 | 1 | 0.4% |
22157 | 1 | 0.4% |
71273 | 1 | 0.4% |
(Missing) | 266 | 96.4% |
Value | Count | Frequency (%) |
323 | 1 | |
489 | 1 | |
846 | 1 | |
1573 | 1 | |
8210 | 1 | |
22157 | 1 | |
30046 | 1 | |
71273 | 1 | |
80863 | 1 | |
173913 | 1 |
Value | Count | Frequency (%) |
173913 | 1 | |
80863 | 1 | |
71273 | 1 | |
30046 | 1 | |
22157 | 1 | |
8210 | 1 | |
1573 | 1 | |
846 | 1 | |
489 | 1 | |
323 | 1 |
Variable | 고속열차 | 새마을 | 무궁화 | 통근열차 |
---|---|---|---|---|
고속열차 | 1.000 | 0.693 | 0.612 | NaN |
새마을 | 0.693 | 1.000 | 0.891 | NaN |
무궁화 | 0.612 | 0.891 | 1.000 | NaN |
통근열차 | NaN | NaN | NaN | 1.000 |
Variable | 고속열차 | 새마을 | 무궁화 | 통근열차 |
---|---|---|---|---|
고속열차 | 1.000 | 0.407 | 0.461 | NaN |
새마을 | 0.407 | 1.000 | 0.653 | 0.771 |
무궁화 | 0.461 | 0.653 | 1.000 | 1.000 |
통근열차 | NaN | 0.771 | 1.000 | 1.000 |
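The NaN entries in these matrices are what one would expect when two columns share too few rows with both values present. The report does not say which coefficient each matrix uses, so the following is only a sketch of one common approach: a Pearson correlation computed over pairwise-complete rows, written in plain Python. The function `pearson_pairwise` and the helper `column` are hypothetical names for this example, and the input is limited to the first ten sample rows shown in this report, so the result is not comparable to the matrix values.

```python
import math

# First ten sample rows of this report; None marks a missing value.
columns = ["고속열차", "새마을", "무궁화", "통근열차"]
rows = [
    ("서울", 13821606, 511593, 2475664, None),
    ("용산", 4923025, 530128, 1560979, None),
    ("수색", None, None, 1303, None),
    ("행신", 770568, None, None, None),
    ("문산", None, 1757, 814, None),
    ("운천", None, 15, None, None),
    ("임진강", None, 1137, 480, None),
    ("도라산", None, 20431, None, None),
    ("인천공항", 60153, None, None, None),
    ("검암", 21738, None, None, None),
]

def column(name):
    """Return one numeric column from the sample rows."""
    index = columns.index(name) + 1  # skip the station name
    return [row[index] for row in rows]

def pearson_pairwise(xs, ys):
    """Pearson r over rows where both values are present, else NaN."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    if len(pairs) < 2:
        return math.nan  # not enough overlapping observations
    mean_x = sum(x for x, _ in pairs) / len(pairs)
    mean_y = sum(y for _, y in pairs) / len(pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    if sxx == 0 or syy == 0:
        return math.nan  # a constant column has no defined correlation
    return sxy / math.sqrt(sxx * syy)

r_saemaeul_mugunghwa = pearson_pairwise(column("새마을"), column("무궁화"))
r_highspeed_commuter = pearson_pairwise(column("고속열차"), column("통근열차"))
print(r_saemaeul_mugunghwa, r_highspeed_commuter)
```

In this ten-row sample 통근열차 is missing everywhere, so any pair involving it yields NaN, while 새마을 and 무궁화 share four complete rows and produce a finite coefficient.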
First rows
Index | 역 | 고속열차 | 새마을 | 무궁화 | 통근열차 |
---|---|---|---|---|---|
0 | 서울 | 13821606 | 511593 | 2475664 | <NA> |
1 | 용산 | 4923025 | 530128 | 1560979 | <NA> |
2 | 수색 | <NA> | <NA> | 1303 | <NA> |
3 | 행신 | 770568 | <NA> | <NA> | <NA> |
4 | 문산 | <NA> | 1757 | 814 | <NA> |
5 | 운천 | <NA> | 15 | <NA> | <NA> |
6 | 임진강 | <NA> | 1137 | 480 | <NA> |
7 | 도라산 | <NA> | 20431 | <NA> | <NA> |
8 | 인천공항 | 60153 | <NA> | <NA> | <NA> |
9 | 검암 | 21738 | <NA> | <NA> | <NA> |
Last rows
Index | 역 | 고속열차 | 새마을 | 무궁화 | 통근열차 |
---|---|---|---|---|---|
266 | 완사 | <NA> | <NA> | 3596 | <NA> |
267 | 북천 | <NA> | 1828 | 15697 | <NA> |
268 | 양보 | <NA> | <NA> | <NA> | <NA> |
269 | 횡천 | <NA> | <NA> | 6741 | <NA> |
270 | 하동 | <NA> | 2671 | 29887 | <NA> |
271 | 울산 | 2136326 | <NA> | <NA> | <NA> |
272 | 진례 | <NA> | <NA> | 3879 | <NA> |
273 | 창원중앙 | 763277 | 60669 | 252290 | <NA> |
274 | 가야 | <NA> | <NA> | <NA> | <NA> |
275 | 신해운대 | <NA> | 31652 | 171642 | <NA> |