Dataset statistics
Number of variables | 4 |
---|---|
Number of observations | 236 |
Missing cells | 39 |
Missing cells (%) | 4.1% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 8.2 KiB |
Average record size in memory | 35.6 B |
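The two derived figures in this table can be rechecked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (rows times columns) and that the average record size is the total size divided by the number of rows. The variable names are illustrative only.

```python
# Values copied from the "Dataset statistics" table.
n_variables = 4
n_observations = 236
missing_cells = 39
total_size_kib = 8.2  # reported total size in memory, in KiB

# Assumption: missing cells (%) = missing cells / (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: average record size = total bytes / number of rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{missing_pct:.1f}%")         # 4.1%
print(f"{avg_record_bytes:.1f} B")   # 35.6 B
```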
Variable types
Text | 1 |
---|---|
Numeric | 3 |
Dataset
Description | Passenger boarding and alighting figures by station for down-line (하행) Mugunghwa (무궁화) trains. |
---|---|
Author | 한국철도공사 (Korea Railroad Corporation, KORAIL) |
URL | https://www.data.go.kr/data/15068480/fileData.do |
Alerts
승차 is highly overall correlated with 하차 and 1 other field | High correlation |
하차 is highly overall correlated with 승차 and 1 other field | High correlation |
인키로 is highly overall correlated with 승차 and 1 other field | High correlation |
승차 has 9 (3.8%) missing values | Missing |
하차 has 14 (5.9%) missing values | Missing |
인키로 has 16 (6.8%) missing values | Missing |
역명 has unique values | Unique |
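The percentages in the Missing alerts follow from the per-column counts and the 236 observations. As a minimal sketch, with the dictionary name chosen here for illustration, they can be rechecked and reconciled with the 39 missing cells reported under Dataset statistics:

```python
n_observations = 236

# Missing counts per column, copied from the alerts above (역명 has none).
missing_by_column = {"역명": 0, "승차": 9, "하차": 14, "인키로": 16}

# Share of rows with a missing value, per column.
missing_pct = {
    column: 100 * count / n_observations
    for column, count in missing_by_column.items()
}

# The per-column counts should add up to the dataset-level figure.
total_missing = sum(missing_by_column.values())

for column, pct in missing_pct.items():
    print(f"{column}: {pct:.1f}%")
print(total_missing)  # 39
```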
Reproduction
Analysis started | 2023-12-12 06:02:02.115492 |
---|---|
Analysis finished | 2023-12-12 06:02:03.572877 |
Duration | 1.46 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
역명 (station name)
Text
UNIQUE
Distinct | 236 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.0 KiB |
Value | Count | Frequency (%) |
가평 | 1 | 0.4% |
의정부 | 1 | 0.4% |
전곡 | 1 | 0.4% |
용궁 | 1 | 0.4% |
용문 | 1 | 0.4% |
용산 | 1 | 0.4% |
웅천 | 1 | 0.4% |
원동 | 1 | 0.4% |
원주 | 1 | 0.4% |
월내 | 1 | 0.4% |
Other values (226) | 226 | 95.8% |
Most occurring characters
Value | Count | Frequency (%) |
천 | 26 | 4.9% |
동 | 18 | 3.4% |
산 | 17 | 3.2% |
주 | 16 | 3.0% |
양 | 12 | 2.3% |
원 | 12 | 2.3% |
성 | 11 | 2.1% |
신 | 11 | 2.1% |
사 | 8 | 1.5% |
강 | 8 | 1.5% |
Other values (171) | 387 | 73.6% |
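A character-frequency table like this one can be approximated with the standard library. The sketch below is not the profiler's own implementation. It uses only the ten station names listed earlier in this section, so its counts will not match the report's totals over all 236 names (526 characters). The variable names are illustrative.

```python
import unicodedata
from collections import Counter

# The ten station names shown in the value table (a sample, not all 236).
station_names = ["가평", "의정부", "전곡", "용궁", "용문",
                 "용산", "웅천", "원동", "원주", "월내"]

# Count every character across all names.
char_counts = Counter("".join(station_names))

# Group characters by Unicode general category ("Lo" means Other Letter).
category_counts = Counter(
    unicodedata.category(ch) for ch in char_counts.elements()
)

# Rough script check: Hangul syllable names start with "HANGUL".
all_hangul = all(
    unicodedata.name(ch).startswith("HANGUL") for ch in char_counts
)

for ch, count in char_counts.most_common(3):
    print(ch, count)
print(dict(category_counts), all_hangul)
```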
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 526 | 100.0% |
Most frequent character per category
Other Letter: identical to the Most occurring characters table, since all 526 characters belong to this category.
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 526 | 100.0% |
Most frequent character per script
Hangul: identical to the Most occurring characters table, since all 526 characters are Hangul.
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 526 | 100.0% |
Most frequent character per block
Hangul: identical to the Most occurring characters table, since all 526 characters are in the Hangul block.
승차 (boardings)
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 224 |
---|---|
Distinct (%) | 98.7% |
Missing | 9 |
Missing (%) | 3.8% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 125916.1 |
Minimum | 11 |
---|---|
Maximum | 3538377 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.2 KiB |
Quantile statistics
Minimum | 11 |
---|---|
5-th percentile | 373.8 |
Q1 | 1731.5 |
median | 12325 |
Q3 | 57348 |
95-th percentile | 642973.2 |
Maximum | 3538377 |
Range | 3538366 |
Interquartile range (IQR) | 55616.5 |
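Range and IQR are simple differences of the quantiles listed above. A short check, using values copied from this table and illustrative variable names:

```python
# Quantile statistics for 승차, copied from the table above.
minimum, q1, median, q3, maximum = 11, 1731.5, 12325, 57348, 3538377

value_range = maximum - minimum  # spread of the whole column
iqr = q3 - q1                    # spread of the middle 50% of stations

print(value_range)  # 3538366
print(iqr)          # 55616.5
```

The IQR is small relative to the range, which is consistent with the strong right skew reported under Descriptive statistics.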
Descriptive statistics
Standard deviation | 396715.35 |
---|---|
Coefficient of variation (CV) | 3.1506324 |
Kurtosis | 35.005259 |
Mean | 125916.1 |
Median Absolute Deviation (MAD) | 11730 |
Skewness | 5.4469172 |
Sum | 28582955 |
Variance | 1.5738307 × 10¹¹ |
Monotonicity | Not monotonic |
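Several of these descriptive statistics are tied together by simple identities: the mean is the sum over the non-missing count, the coefficient of variation is the standard deviation over the mean, and the variance is the squared standard deviation. The sketch below rechecks them from the rounded figures in the report, so small differences in the last digits are expected. The variable names are illustrative.

```python
# Figures for 승차, copied from the report (rounded as displayed).
n_observations = 236
n_missing = 9
total = 28582955      # Sum
mean = 125916.1       # Mean
std = 396715.35       # Standard deviation

n_valid = n_observations - n_missing  # rows with a value: 227

mean_from_sum = total / n_valid  # should agree with the reported mean
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance as squared standard deviation

print(round(mean_from_sum, 1))   # 125916.1
print(round(cv, 4))              # 3.1506
print(f"{variance:.4e}")         # 1.5738e+11
```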
Common values
Value | Count | Frequency (%) |
480 | 2 | 0.8% |
8271 | 2 | 0.8% |
464 | 2 | 0.8% |
28870 | 1 | 0.4% |
1560978 | 1 | 0.4% |
8497 | 1 | 0.4% |
45700 | 1 | 0.4% |
136267 | 1 | 0.4% |
16953 | 1 | 0.4% |
4986 | 1 | 0.4% |
Other values (214) | 214 | 90.7% |
(Missing) | 9 | 3.8% |
Minimum 10 values
Value | Count | Frequency (%) |
11 | 1 | 0.4% |
27 | 1 | 0.4% |
54 | 1 | 0.4% |
157 | 1 | 0.4% |
188 | 1 | 0.4% |
210 | 1 | 0.4% |
216 | 1 | 0.4% |
218 | 1 | 0.4% |
242 | 1 | 0.4% |
264 | 1 | 0.4% |
Maximum 10 values
Value | Count | Frequency (%) |
3538377 | 1 | 0.4% |
2475664 | 1 | 0.4% |
2401883 | 1 | 0.4% |
1560978 | 1 | 0.4% |
1488261 | 1 | 0.4% |
1400512 | 1 | 0.4% |
1266720 | 1 | 0.4% |
1132277 | 1 | 0.4% |
1112067 | 1 | 0.4% |
703420 | 1 | 0.4% |
하차 (alightings)
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 221 |
---|---|
Distinct (%) | 99.5% |
Missing | 14 |
Missing (%) | 5.9% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 128752.05 |
Minimum | 3 |
---|---|
Maximum | 2105604 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.2 KiB |
Quantile statistics
Minimum | 3 |
---|---|
5-th percentile | 432 |
Q1 | 3786.75 |
median | 25844 |
Q3 | 89442.5 |
95-th percentile | 542673.95 |
Maximum | 2105604 |
Range | 2105601 |
Interquartile range (IQR) | 85655.75 |
Descriptive statistics
Standard deviation | 305464.43 |
---|---|
Coefficient of variation (CV) | 2.3725015 |
Kurtosis | 18.657911 |
Mean | 128752.05 |
Median Absolute Deviation (MAD) | 24760 |
Skewness | 4.1437176 |
Sum | 28582955 |
Variance | 9.3308518 × 10¹⁰ |
Monotonicity | Not monotonic |
Common values
Value | Count | Frequency (%) |
3988 | 2 | 0.8% |
378 | 1 | 0.4% |
190212 | 1 | 0.4% |
906 | 1 | 0.4% |
79409 | 1 | 0.4% |
390 | 1 | 0.4% |
42281 | 1 | 0.4% |
37624 | 1 | 0.4% |
543879 | 1 | 0.4% |
14363 | 1 | 0.4% |
Other values (211) | 211 | 89.4% |
(Missing) | 14 | 5.9% |
Minimum 10 values
Value | Count | Frequency (%) |
3 | 1 | 0.4% |
36 | 1 | 0.4% |
125 | 1 | 0.4% |
132 | 1 | 0.4% |
188 | 1 | 0.4% |
285 | 1 | 0.4% |
325 | 1 | 0.4% |
330 | 1 | 0.4% |
350 | 1 | 0.4% |
378 | 1 | 0.4% |
Maximum 10 values
Value | Count | Frequency (%) |
2105604 | 1 | 0.4% |
1728405 | 1 | 0.4% |
1667795 | 1 | 0.4% |
1611316 | 1 | 0.4% |
1532307 | 1 | 0.4% |
1292506 | 1 | 0.4% |
1219318 | 1 | 0.4% |
1045925 | 1 | 0.4% |
871090 | 1 | 0.4% |
569717 | 1 | 0.4% |
인키로 (passenger-kilometers)
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 220 |
---|---|
Distinct (%) | 100.0% |
Missing | 16 |
Missing (%) | 6.8% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 12289912 |
Minimum | 5203 |
---|---|
Maximum | 2.3893947 × 10⁸ |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.2 KiB |
Quantile statistics
Minimum | 5203 |
---|---|
5-th percentile | 58285 |
Q1 | 404207.75 |
median | 2379865.5 |
Q3 | 9044927.8 |
95-th percentile | 52554805 |
Maximum | 2.3893947 × 10⁸ |
Range | 2.3893426 × 10⁸ |
Interquartile range (IQR) | 8640720 |
Descriptive statistics
Standard deviation | 29840010 |
---|---|
Coefficient of variation (CV) | 2.4280084 |
Kurtosis | 24.362273 |
Mean | 12289912 |
Median Absolute Deviation (MAD) | 2231148.5 |
Skewness | 4.5349621 |
Sum | 2.7037807 × 10⁹ |
Variance | 8.9042617 × 10¹⁴ |
Monotonicity | Not monotonic |
Common values
Value | Count | Frequency (%) |
298060 | 1 | 0.4% |
116753 | 1 | 0.4% |
5093562 | 1 | 0.4% |
53736 | 1 | 0.4% |
5389230 | 1 | 0.4% |
1241042 | 1 | 0.4% |
52315673 | 1 | 0.4% |
646184 | 1 | 0.4% |
153015 | 1 | 0.4% |
3680815 | 1 | 0.4% |
Other values (210) | 210 | 89.0% |
(Missing) | 16 | 6.8% |
Minimum 10 values
Value | Count | Frequency (%) |
5203 | 1 | 0.4% |
11825 | 1 | 0.4% |
12581 | 1 | 0.4% |
18713 | 1 | 0.4% |
24689 | 1 | 0.4% |
28611 | 1 | 0.4% |
30817 | 1 | 0.4% |
39307 | 1 | 0.4% |
51833 | 1 | 0.4% |
53736 | 1 | 0.4% |
Maximum 10 values
Value | Count | Frequency (%) |
238939466 | 1 | 0.4% |
190739472 | 1 | 0.4% |
136332243 | 1 | 0.4% |
133692590 | 1 | 0.4% |
131565790 | 1 | 0.4% |
125432319 | 1 | 0.4% |
103493900 | 1 | 0.4% |
88122053 | 1 | 0.4% |
86626547 | 1 | 0.4% |
58472706 | 1 | 0.4% |
| | 승차 | 하차 | 인키로 |
---|---|---|---|
승차 | 1.000 | 0.761 | 0.677 |
하차 | 0.761 | 1.000 | 0.941 |
인키로 | 0.677 | 0.941 | 1.000 |
| | 승차 | 하차 | 인키로 |
---|---|---|---|
승차 | 1.000 | 0.725 | 0.679 |
하차 | 0.725 | 1.000 | 0.969 |
인키로 | 0.679 | 0.969 | 1.000 |
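The report does not say which coefficient each of the two matrices uses. As a rough sketch only, and assuming they are a linear (Pearson) and a rank-based (Spearman) coefficient, both can be computed with the standard library. The input below is limited to the eight sample rows from the start of the dataset that have both 하차 and 인키로, so the results will not reproduce the full-dataset matrices. The function names are illustrative.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# 하차 and 인키로 for 가평, 강경, 강구, 강릉, 개포, 건천, 경산, 경주.
alightings = [3319, 93462, 13037, 38906, 3, 3475, 569717, 229050]
passenger_km = [589437, 10002172, 530516, 2978561, 11825, 151638,
                41841114, 14512392]

print(round(pearson(alightings, passenger_km), 3))
print(round(spearman(alightings, passenger_km), 3))
```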
First rows
| | 역명 | 승차 | 하차 | 인키로 |
---|---|---|---|---|
0 | 가평 | <NA> | 3319 | 589437 |
1 | 각계 | <NA> | <NA> | <NA> |
2 | 간석 | 919 | <NA> | <NA> |
3 | 강경 | 18420 | 93462 | 10002172 |
4 | 강구 | 438 | 13037 | 530516 |
5 | 강릉 | 16868 | 38906 | 2978561 |
6 | 개포 | 6429 | 3 | 11825 |
7 | 건천 | 1428 | 3475 | 151638 |
8 | 경산 | 451676 | 569717 | 41841114 |
9 | 경주 | 215773 | 229050 | 14512392 |
Last rows
| | 역명 | 승차 | 하차 | 인키로 |
---|---|---|---|---|
226 | 현동 | 900 | 132 | 12581 |
227 | 호계 | 201839 | 162696 | 12182775 |
228 | 홍성 | 56383 | 361900 | 32621121 |
229 | 화명 | 32238 | 32078 | 3270452 |
230 | 화본 | 5447 | 2499 | 234400 |
231 | 화순 | 5299 | 7109 | 720267 |
232 | 황간 | 12332 | 27716 | 2969822 |
233 | 횡천 | 688 | 6455 | 562171 |
234 | 효천 | 3626 | 4467 | 516922 |
235 | 희방사 | 242 | 2680 | 179105 |