Dataset statistics
Number of variables | 5 |
---|---|
Number of observations | 276 |
Missing cells | 662 |
Missing cells (%) | 48.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 12.0 KiB |
Average record size in memory | 44.5 B |
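The summary percentages can be reproduced from the counts in this report. The following is a minimal sketch, assuming that missing cells are counted over all 276 × 5 cells and that "12.0 KiB" means roughly 12 × 1024 bytes; both are assumptions about how the profiler derives these figures. The per-variable missing counts are taken from the "Missing" alerts in this report.

```python
# Counts copied from this report.
n_rows, n_cols = 276, 5
missing_by_variable = {"역": 0, "고속열차": 225, "새마을": 128, "무궁화": 43, "통근열차": 266}

# Missing cells as a share of all cells in the table.
missing_cells = sum(missing_by_variable.values())
missing_pct = 100 * missing_cells / (n_rows * n_cols)

# Assumption: "12.0 KiB" is a rounded 12 * 1024 bytes, so this is approximate.
avg_record_bytes = 12.0 * 1024 / n_rows

print(missing_cells, f"{missing_pct:.1f}%", f"{avg_record_bytes:.1f} B")
```

Under these assumptions the script prints `662 48.0% 44.5 B`, matching the table above.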
Variable types
Text | 1 |
---|---|
Numeric | 4 |
Dataset
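Description | Alighting passenger counts by station, broken down by train type (high-speed trains, 새마을 (Saemaeul), etc.). |
---|---|
Author | 한국철도공사 (Korea Railroad Corporation, KORAIL) |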
URL | https://www.data.go.kr/data/15068458/fileData.do |
새마을 is highly overall correlated with 무궁화 | High correlation |
무궁화 is highly overall correlated with 새마을 and 1 other field | High correlation |
통근열차 is highly overall correlated with 무궁화 | High correlation |
고속열차 has 225 (81.5%) missing values | Missing |
새마을 has 128 (46.4%) missing values | Missing |
무궁화 has 43 (15.6%) missing values | Missing |
통근열차 has 266 (96.4%) missing values | Missing |
역 has unique values | Unique |
Reproduction
Analysis started | 2023-12-12 23:49:23.141421 |
---|---|
Analysis finished | 2023-12-12 23:49:25.215781 |
Duration | 2.07 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
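A report of this kind could be regenerated along the following lines. This is a sketch, not the exact procedure used here: `build_report`, `csv_path`, `out_path`, and the default encoding are hypothetical names and values chosen for illustration, and the settings stored in config.json are not reproduced.

```python
def build_report(csv_path: str, out_path: str = "report.html", encoding: str = "utf-8") -> None:
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so the function can be defined without the dependencies installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source file is a CSV; adjust `encoding` if the file is not UTF-8.
    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Dataset profile")
    report.to_file(out_path)
```

Calling `build_report("stations.csv")` with a local copy of the file (the file name is a placeholder) would write `report.html` to the working directory.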
역
Text
UNIQUE
Distinct | 276 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.3 KiB |
Value | Count | Frequency (%) |
서울 | 1 | 0.4% |
득량 | 1 | 0.4% |
진상 | 1 | 0.4% |
광양 | 1 | 0.4% |
벌교 | 1 | 0.4% |
조성 | 1 | 0.4% |
예당 | 1 | 0.4% |
순천 | 1 | 0.4% |
명봉 | 1 | 0.4% |
구례구 | 1 | 0.4% |
Other values (266) | 266 | 96.4% |
Most occurring characters
Value | Count | Frequency (%) |
천 | 31 | 4.9% |
산 | 22 | 3.5% |
동 | 19 | 3.0% |
주 | 18 | 2.8% |
신 | 14 | 2.2% |
양 | 13 | 2.1% |
성 | 13 | 2.1% |
원 | 12 | 1.9% |
정 | 10 | 1.6% |
리 | 10 | 1.6% |
Other values (184) | 470 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 630 | |
Decimal Number | 1 | 0.2% |
Uppercase Letter | 1 | 0.2% |
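The category labels above correspond to Unicode general categories: "Other Letter" is `Lo`, "Decimal Number" is `Nd`, and "Uppercase Letter" is `Lu`. A small sketch using Python's standard `unicodedata` module shows the classification on the ten station names listed earlier. The characters `2` and `T` are appended only for illustration, since the report does not show which station name contains them.

```python
import unicodedata
from collections import Counter

# Station names from the frequency table in this report.
sample = ["서울", "득량", "진상", "광양", "벌교", "조성", "예당", "순천", "명봉", "구례구"]

# Illustration only: append the two non-Hangul characters reported above.
chars = "".join(sample) + "2T"

# Tally the Unicode general category of every character.
categories = Counter(unicodedata.category(ch) for ch in chars)
print(categories)

# Hangul syllables carry algorithmic names starting with "HANGUL SYLLABLE".
print(unicodedata.name("천"))
```

All Hangul syllables fall under `Lo`, which is why nearly every character in the 역 column is counted as "Other Letter".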
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
천 | 31 | 4.9% |
산 | 22 | 3.5% |
동 | 19 | 3.0% |
주 | 18 | 2.9% |
신 | 14 | 2.2% |
양 | 13 | 2.1% |
성 | 13 | 2.1% |
원 | 12 | 1.9% |
정 | 10 | 1.6% |
리 | 10 | 1.6% |
Other values (182) | 468 |
Decimal Number
Value | Count | Frequency (%) |
2 | 1 |
Uppercase Letter
Value | Count | Frequency (%) |
T | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 630 | |
Common | 1 | 0.2% |
Latin | 1 | 0.2% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
천 | 31 | 4.9% |
산 | 22 | 3.5% |
동 | 19 | 3.0% |
주 | 18 | 2.9% |
신 | 14 | 2.2% |
양 | 13 | 2.1% |
성 | 13 | 2.1% |
원 | 12 | 1.9% |
정 | 10 | 1.6% |
리 | 10 | 1.6% |
Other values (182) | 468 |
Common
Value | Count | Frequency (%) |
2 | 1 |
Latin
Value | Count | Frequency (%) |
T | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 630 | |
ASCII | 2 | 0.3% |
Most frequent character per block
Hangul
Value | Count | Frequency (%) |
천 | 31 | 4.9% |
산 | 22 | 3.5% |
동 | 19 | 3.0% |
주 | 18 | 2.9% |
신 | 14 | 2.2% |
양 | 13 | 2.1% |
성 | 13 | 2.1% |
원 | 12 | 1.9% |
정 | 10 | 1.6% |
리 | 10 | 1.6% |
Other values (182) | 468 |
ASCII
Value | Count | Frequency (%) |
2 | 1 | |
T | 1 |
고속열차
Real number (ℝ)
MISSING
Distinct | 51 |
---|---|
Distinct (%) | 100.0% |
Missing | 225 |
Missing (%) | 81.5% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1314533.2 |
Minimum | 13990 |
---|---|
Maximum | 14181231 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.6 KiB |
Quantile statistics
Minimum | 13990 |
---|---|
5-th percentile | 39147.5 |
Q1 | 117435.5 |
median | 390867 |
Q3 | 979533.5 |
95-th percentile | 5439788.5 |
Maximum | 14181231 |
Range | 14167241 |
Interquartile range (IQR) | 862098 |
Descriptive statistics
Standard deviation | 2427984.5 |
---|---|
Coefficient of variation (CV) | 1.8470317 |
Kurtosis | 15.554503 |
Mean | 1314533.2 |
Median Absolute Deviation (MAD) | 315990 |
Skewness | 3.5361466 |
Sum | 67041194 |
Variance | 5.8951089 × 10^12 |
Monotonicity | Not monotonic |
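Several of the 고속열차 statistics are related by simple arithmetic, which gives a quick consistency check on the tables above. The sketch below uses only values copied from this report. The reported standard deviation is rounded, so the derived values agree only to within rounding.

```python
# Values copied from the 고속열차 tables in this report.
count = 276 - 225                     # observations minus missing values
total = 67041194                      # Sum
minimum, maximum = 13990, 14181231
q1, q3 = 117435.5, 979533.5
std = 2427984.5                       # Standard deviation (rounded in the report)

mean = total / count                  # Mean = Sum / non-missing count
value_range = maximum - minimum       # Range = Maximum - Minimum
iqr = q3 - q1                         # IQR = Q3 - Q1
cv = std / mean                       # Coefficient of variation = std / mean
variance = std ** 2                   # Variance = std squared

print(f"{mean:.1f} {value_range} {iqr:.0f} {cv:.4f} {variance:.4e}")
```

The same relations hold for the 새마을, 무궁화, and 통근열차 sections.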
Value | Count | Frequency (%) |
5137021 | 1 | 0.4% |
1333029 | 1 | 0.4% |
287237 | 1 | 0.4% |
858979 | 1 | 0.4% |
173569 | 1 | 0.4% |
42715 | 1 | 0.4% |
35964 | 1 | 0.4% |
691601 | 1 | 0.4% |
212995 | 1 | 0.4% |
527737 | 1 | 0.4% |
Other values (41) | 41 | 14.9% |
(Missing) | 225 | 81.5% |
Value | Count | Frequency (%) |
13990 | 1 | |
20116 | 1 | |
35964 | 1 | |
42331 | 1 | |
42715 | 1 | |
54981 | 1 | |
74877 | 1 | |
83105 | 1 | |
83141 | 1 | |
92767 | 1 |
Value | Count | Frequency (%) |
14181231 | 1 | |
6051142 | 1 | |
5742556 | 1 | |
5137021 | 1 | |
4903041 | 1 | |
4738698 | 1 | |
3277052 | 1 | |
2743629 | 1 | |
2129478 | 1 | |
2123859 | 1 |
새마을
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 145 |
---|---|
Distinct (%) | 98.0% |
Missing | 128 |
Missing (%) | 46.4% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 72648.081 |
Minimum | 2 |
---|---|
Maximum | 1376772 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.6 KiB |
Quantile statistics
Minimum | 2 |
---|---|
5-th percentile | 44.45 |
Q1 | 553.75 |
median | 5388.5 |
Q3 | 44679.25 |
95-th percentile | 464317.2 |
Maximum | 1376772 |
Range | 1376770 |
Interquartile range (IQR) | 44125.5 |
Descriptive statistics
Standard deviation | 181622.89 |
---|---|
Coefficient of variation (CV) | 2.500037 |
Kurtosis | 21.906715 |
Mean | 72648.081 |
Median Absolute Deviation (MAD) | 5329 |
Skewness | 4.2244665 |
Sum | 10751916 |
Variance | 3.2986875 × 10^10 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
2 | 2 | 0.7% |
335 | 2 | 0.7% |
63 | 2 | 0.7% |
75283 | 1 | 0.4% |
54527 | 1 | 0.4% |
215199 | 1 | 0.4% |
664630 | 1 | 0.4% |
2218 | 1 | 0.4% |
27180 | 1 | 0.4% |
443135 | 1 | 0.4% |
Other values (135) | 135 | 48.9% |
(Missing) | 128 | 46.4% |
Value | Count | Frequency (%) |
2 | 2 | |
5 | 1 | |
11 | 1 | |
24 | 1 | |
30 | 1 | |
35 | 1 | |
42 | 1 | |
49 | 1 | |
56 | 1 | |
63 | 2 |
Value | Count | Frequency (%) |
1376772 | 1 | |
910171 | 1 | |
674079 | 1 | |
664630 | 1 | |
545770 | 1 | |
536832 | 1 | |
508801 | 1 | |
475723 | 1 | |
443135 | 1 | |
367397 | 1 |
무궁화
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 233 |
---|---|
Distinct (%) | 100.0% |
Missing | 43 |
Missing (%) | 15.6% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 242927.06 |
Minimum | 8 |
---|---|
Maximum | 4416588 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.6 KiB |
Quantile statistics
Minimum | 8 |
---|---|
5-th percentile | 480.6 |
Q1 | 6301 |
median | 38331 |
Q3 | 146009 |
95-th percentile | 1505355 |
Maximum | 4416588 |
Range | 4416580 |
Interquartile range (IQR) | 139708 |
Descriptive statistics
Standard deviation | 582591.1 |
---|---|
Coefficient of variation (CV) | 2.3982141 |
Kurtosis | 18.946297 |
Mean | 242927.06 |
Median Absolute Deviation (MAD) | 36758 |
Skewness | 4.0434009 |
Sum | 56602006 |
Variance | 3.394124 × 10^11 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
4378 | 1 | 0.4% |
32951 | 1 | 0.4% |
1305 | 1 | 0.4% |
696 | 1 | 0.4% |
771043 | 1 | 0.4% |
1919822 | 1 | 0.4% |
16924 | 1 | 0.4% |
65691 | 1 | 0.4% |
876399 | 1 | 0.4% |
3377 | 1 | 0.4% |
Other values (223) | 223 | 80.8% |
(Missing) | 43 | 15.6% |
Value | Count | Frequency (%) |
8 | 1 | |
188 | 1 | |
220 | 1 | |
279 | 1 | |
325 | 1 | |
335 | 1 | |
400 | 1 | |
410 | 1 | |
430 | 1 | |
454 | 1 |
Value | Count | Frequency (%) |
4416588 | 1 | |
3442822 | 1 | |
2854646 | 1 | |
2407004 | 1 | |
2363879 | 1 | |
2354296 | 1 | |
2342738 | 1 | |
2173175 | 1 | |
1919822 | 1 | |
1668314 | 1 |
통근열차
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 10 |
---|---|
Distinct (%) | 100.0% |
Missing | 266 |
Missing (%) | 96.4% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 38969.3 |
Minimum | 130 |
---|---|
Maximum | 186309 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.6 KiB |
Quantile statistics
Minimum | 130 |
---|---|
5-th percentile | 148 |
Q1 | 1231 |
median | 2285 |
Q3 | 59225 |
95-th percentile | 153388.35 |
Maximum | 186309 |
Range | 186179 |
Interquartile range (IQR) | 57994 |
Descriptive statistics
Standard deviation | 65259.971 |
---|---|
Coefficient of variation (CV) | 1.6746508 |
Kurtosis | 1.883231 |
Mean | 38969.3 |
Median Absolute Deviation (MAD) | 2135 |
Skewness | 1.6505504 |
Sum | 389693 |
Variance | 4.2588638 × 10^9 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
186309 | 1 | 0.4% |
2163 | 1 | 0.4% |
130 | 1 | 0.4% |
170 | 1 | 0.4% |
5333 | 1 | 0.4% |
77189 | 1 | 0.4% |
1042 | 1 | 0.4% |
1798 | 1 | 0.4% |
2407 | 1 | 0.4% |
113152 | 1 | 0.4% |
(Missing) | 266 | 96.4% |
Value | Count | Frequency (%) |
130 | 1 | |
170 | 1 | |
1042 | 1 | |
1798 | 1 | |
2163 | 1 | |
2407 | 1 | |
5333 | 1 | |
77189 | 1 | |
113152 | 1 | |
186309 | 1 |
Value | Count | Frequency (%) |
186309 | 1 | |
113152 | 1 | |
77189 | 1 | |
5333 | 1 | |
2407 | 1 | |
2163 | 1 | |
1798 | 1 | |
1042 | 1 | |
170 | 1 | |
130 | 1 |
| | 고속열차 | 새마을 | 무궁화 | 통근열차 |
---|---|---|---|---|
고속열차 | 1.000 | 0.502 | 0.683 | NaN |
새마을 | 0.502 | 1.000 | 0.893 | NaN |
무궁화 | 0.683 | 0.893 | 1.000 | NaN |
통근열차 | NaN | NaN | NaN | 1.000 |
| | 고속열차 | 새마을 | 무궁화 | 통근열차 |
---|---|---|---|---|
고속열차 | 1.000 | 0.415 | 0.448 | NaN |
새마을 | 0.415 | 1.000 | 0.635 | 0.486 |
무궁화 | 0.448 | 0.635 | 1.000 | 1.000 |
통근열차 | NaN | 0.486 | 1.000 | 1.000 |
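The NaN entries are consistent with variable pairs that have too few stations where both values are present. This is an interpretation and not something the report states. The sketch below illustrates the idea with a pairwise-complete Pearson correlation, applied to the first ten rows listed in this report's sample table. Pearson is used here only for illustration, since the report does not say which coefficient each matrix uses, and `pairwise_pearson` is a hypothetical helper.

```python
import math

def pairwise_pearson(xs, ys):
    """Pearson correlation over rows where both values are present; NaN if undefined."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    if len(pairs) < 2:
        return math.nan  # not enough overlapping observations
    mx = sum(x for x, _ in pairs) / len(pairs)
    my = sum(y for _, y in pairs) / len(pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        return math.nan  # a constant column has no defined correlation
    return sxy / math.sqrt(sxx * syy)

# First ten rows of the dataset as listed in this report; None marks <NA>.
high_speed = [14181231, 5137021, None, 721378, None, None, None, None, 74877, 20116]
saemaeul = [545770, 508801, None, None, 1598, 5, 1492, 20781, None, None]
mugunghwa = [2363879, 1534713, 1315, None, 814, None, 864, None, None, None]
commuter = [None] * 10

r_overlap = pairwise_pearson(saemaeul, mugunghwa)   # four rows have both values
r_empty = pairwise_pearson(high_speed, commuter)    # no row has both values
print(r_overlap, r_empty)
```

With only ten rows the coefficients differ from the full-dataset values in the matrices, but the second call shows how a pair with no shared observations yields NaN.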
| | 역 | 고속열차 | 새마을 | 무궁화 | 통근열차 |
---|---|---|---|---|---|
0 | 서울 | 14181231 | 545770 | 2363879 | <NA> |
1 | 용산 | 5137021 | 508801 | 1534713 | <NA> |
2 | 수색 | <NA> | <NA> | 1315 | <NA> |
3 | 행신 | 721378 | <NA> | <NA> | <NA> |
4 | 문산 | <NA> | 1598 | 814 | <NA> |
5 | 운천 | <NA> | 5 | <NA> | <NA> |
6 | 임진강 | <NA> | 1492 | 864 | <NA> |
7 | 도라산 | <NA> | 20781 | <NA> | <NA> |
8 | 인천공항 | 74877 | <NA> | <NA> | <NA> |
9 | 검암 | 20116 | <NA> | <NA> | <NA> |
| | 역 | 고속열차 | 새마을 | 무궁화 | 통근열차 |
---|---|---|---|---|---|
266 | 완사 | <NA> | <NA> | 3886 | <NA> |
267 | 북천 | <NA> | 2068 | 15590 | <NA> |
268 | 양보 | <NA> | <NA> | <NA> | <NA> |
269 | 횡천 | <NA> | <NA> | 7168 | <NA> |
270 | 하동 | <NA> | 2225 | 31535 | <NA> |
271 | 울산 | 2129478 | <NA> | <NA> | <NA> |
272 | 진례 | <NA> | <NA> | 3473 | <NA> |
273 | 창원중앙 | 775167 | 45363 | 249890 | <NA> |
274 | 가야 | <NA> | <NA> | <NA> | <NA> |
275 | 신해운대 | <NA> | 36695 | 190735 | <NA> |