Overview

Dataset statistics

Number of variables: 6
Number of observations: 426
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 21.8 KiB
Average record size in memory: 52.3 B

Variable types

Categorical: 1
Text: 1
Numeric: 4

Dataset

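Description: Passenger departure and arrival information for the Seoul metropolitan-area electric railway (수도권전철), as recorded in the Statistical Yearbook of Railways (철도통계연보) published annually by Korea Railroad Corporation (한국철도공사). The dataset provides the fields 선로명 (line name), 역명 (station name), 승차 (boarding), 강차 (alighting), 승차 (boarding), 강차 (alighting), and 단위 (unit).
URL: https://www.data.go.kr/data/3037646/fileData.do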

Alerts

하행승차 is highly overall correlated with 상행강차 (High correlation)
하행강차 is highly overall correlated with 상행승차 (High correlation)
상행승차 is highly overall correlated with 하행강차 (High correlation)
상행강차 is highly overall correlated with 하행승차 (High correlation)
하행승차 has 32 (7.5%) zeros (Zeros)
하행강차 has 42 (9.9%) zeros (Zeros)
상행승차 has 42 (9.9%) zeros (Zeros)
상행강차 has 33 (7.7%) zeros (Zeros)
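
The percentages in the Zeros alerts follow from the zero counts and the 426 observations reported under Dataset statistics. The following is a minimal Python sketch of that arithmetic, with the counts copied from the alerts; the helper name `zero_share` is illustrative and is not part of ydata-profiling.

```python
# Zero counts per column, copied from the alerts; 426 is the reported row count.
N_OBSERVATIONS = 426
ZERO_COUNTS = {"하행승차": 32, "하행강차": 42, "상행승차": 42, "상행강차": 33}


def zero_share(zero_count: int, n_rows: int) -> float:
    """Return the share of zero-valued rows as a percentage, rounded to one decimal."""
    return round(100 * zero_count / n_rows, 1)


# Recompute the percentage shown next to each zero count.
shares = {column: zero_share(count, N_OBSERVATIONS) for column, count in ZERO_COUNTS.items()}
for column, share in shares.items():
    print(f"{column}: {share}%")
```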

Reproduction

Analysis started: 2023-12-12 01:06:37.355427
Analysis finished: 2023-12-12 01:06:39.499609
Duration: 2.14 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
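
A report of this kind could plausibly be regenerated along the following lines. This is a minimal sketch under stated assumptions, not the configuration actually used here: the function name `build_report`, the CSV path, the output file name, the report title, and the default encoding are all hypothetical. Only `pandas.read_csv`, `ProfileReport`, and `ProfileReport.to_file` are real library calls.

```python
from pathlib import Path


def build_report(csv_path: str, output_html: str = "report.html", encoding: str = "utf-8") -> Path:
    """Profile a CSV file and write an HTML report to output_html.

    The path, title, and encoding are assumptions for illustration; files from
    Korean public data portals may need a different encoding such as "cp949".
    """
    # Third-party imports are deferred so the module loads without them installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Profiling Report")  # hypothetical title
    report.to_file(output_html)
    return Path(output_html)
```

Calling `build_report("passengers.csv")` with a hypothetical local copy of the dataset would write `report.html` to the working directory.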

Variables

선로명
Categorical

Distinct: 15
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 KiB

경원, 중앙선: 77
경부선: 61
분당선: 54
경원선: 32
수인선: 31
Other values (10): 171

Length

Max length: 7
Median length: 3
Mean length: 4.0704225
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경원선
2nd row: 경원선
3rd row: 경원선
4th row: 경원선
5th row: 경원선

Common Values

Value  Count  Frequency (%)
경원, 중앙선  77  18.1%
경부선  61  14.3%
분당선  54  12.7%
경원선  32  7.5%
수인선  31  7.3%
과천, 안산선  30  7.0%
경춘선  30  7.0%
경인선  28  6.6%
동해선  23  5.4%
경강선  15  3.5%
Other values (5)  45  10.6%

Length

Histogram of lengths of the category
Value  Count  Frequency (%)
경원  77  14.4%
중앙선  77  14.4%
경부선  61  11.4%
분당선  54  10.1%
경원선  32  6.0%
수인선  31  5.8%
안산선  30  5.6%
경춘선  30  5.6%
과천  30  5.6%
경인선  28  5.3%
Other values (7)  83  15.6%

역명
Text

Distinct: 341
Distinct (%): 80.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 KiB

Length

Max length: 13
Median length: 2
Mean length: 3.213615
Min length: 2

Characters and Unicode

Total characters: 1369
Distinct characters: 228
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 268
Unique (%): 62.9%

Sample

1st row: 제기동
2nd row: 청량리(서울시립대입구)
3rd row: 청량리
4th row: 중랑
5th row: 회기

Value  Count  Frequency (%)
청량리  4  0.9%
회기  4  0.9%
수원(분당  3  0.7%
초지  3  0.7%
외대앞  3  0.7%
중랑  3  0.7%
왕십리  3  0.7%
초지(소시  3  0.7%
고색  3  0.7%
청량리(서울시립대입구  3  0.7%
Other values (333)  396  92.5%

Most occurring characters

ValueCountFrequency (%)
) 72
 
5.3%
( 72
 
5.3%
35
 
2.6%
33
 
2.4%
33
 
2.4%
28
 
2.0%
27
 
2.0%
24
 
1.8%
24
 
1.8%
24
 
1.8%
Other values (218) 997
72.8%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  1188  86.8%
Close Punctuation  72  5.3%
Open Punctuation  72  5.3%
Decimal Number  35  2.6%
Space Separator  2  0.1%
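
The character statistics for 역명 can be cross-checked against one another: the category counts should add up to the reported total of 1369 characters, and dividing that total by the 426 observations should give the reported mean length. The sketch below uses only figures copied from this report; the variable names are illustrative.

```python
# Figures copied from the report: row count, total characters, and category counts.
N_ROWS = 426
TOTAL_CHARACTERS = 1369
CATEGORY_COUNTS = {
    "Other Letter": 1188,
    "Close Punctuation": 72,
    "Open Punctuation": 72,
    "Decimal Number": 35,
    "Space Separator": 2,
}

# The category counts should account for every character.
category_total = sum(CATEGORY_COUNTS.values())

# Mean length is total characters divided by the number of rows.
mean_length = TOTAL_CHARACTERS / N_ROWS

# Percentage share of each category, rounded to one decimal as in the report.
shares = {name: round(100 * count / TOTAL_CHARACTERS, 1) for name, count in CATEGORY_COUNTS.items()}

print(category_total, round(mean_length, 6), shares)
```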

Most frequent character per category

Other Letter
ValueCountFrequency (%)
35
 
2.9%
33
 
2.8%
33
 
2.8%
28
 
2.4%
27
 
2.3%
24
 
2.0%
24
 
2.0%
24
 
2.0%
23
 
1.9%
22
 
1.9%
Other values (206) 915
77.0%
Decimal Number
Value  Count  Frequency (%)
7  6  17.1%
2  6  17.1%
6  6  17.1%
3  4  11.4%
5  4  11.4%
4  4  11.4%
8  2  5.7%
9  2  5.7%
1  1  2.9%
Close Punctuation
Value  Count  Frequency (%)
)  72  100.0%
Open Punctuation
Value  Count  Frequency (%)
(  72  100.0%
Space Separator
Value  Count  Frequency (%)
(space)  2  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  1188  86.8%
Common  181  13.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
35
 
2.9%
33
 
2.8%
33
 
2.8%
28
 
2.4%
27
 
2.3%
24
 
2.0%
24
 
2.0%
24
 
2.0%
23
 
1.9%
22
 
1.9%
Other values (206) 915
77.0%
Common
Value  Count  Frequency (%)
)  72  39.8%
(  72  39.8%
7  6  3.3%
2  6  3.3%
6  6  3.3%
3  4  2.2%
5  4  2.2%
4  4  2.2%
8  2  1.1%
(space)  2  1.1%
Other values (2)  3  1.7%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  1188  86.8%
ASCII  181  13.2%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
)  72  39.8%
(  72  39.8%
7  6  3.3%
2  6  3.3%
6  6  3.3%
3  4  2.2%
5  4  2.2%
4  4  2.2%
8  2  1.1%
(space)  2  1.1%
Other values (2)  3  1.7%
Hangul
ValueCountFrequency (%)
35
 
2.9%
33
 
2.8%
33
 
2.8%
28
 
2.4%
27
 
2.3%
24
 
2.0%
24
 
2.0%
24
 
2.0%
23
 
1.9%
22
 
1.9%
Other values (206) 915
77.0%

하행승차
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 380
Distinct (%): 89.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1779345.5
Minimum: 0
Maximum: 51189401
Zeros: 32
Zeros (%): 7.5%
Negative: 0
Negative (%): 0.0%
Memory size: 3.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 148258
median: 578414
Q3: 1563154
95-th percentile: 6839871.8
Maximum: 51189401
Range: 51189401
Interquartile range (IQR): 1414896

Descriptive statistics

Standard deviation: 4523701.6
Coefficient of variation (CV): 2.5423402
Kurtosis: 50.906127
Mean: 1779345.5
Median Absolute Deviation (MAD): 526697
Skewness: 6.3988914
Sum: 7.5800119 × 10^8
Variance: 2.0463876 × 10^13
Monotonicity: Not monotonic
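
Several of the 하행승차 statistics above are derived from others, which gives a quick consistency check: range is maximum minus minimum, IQR is Q3 minus Q1, CV is standard deviation divided by mean, variance is the squared standard deviation, and the sum is mean times the number of observations. The sketch below recomputes them from the rounded figures in this report, so results agree only up to rounding. The Sum and Variance entries are read as 7.5800119e8 and 2.0463876e13, and the variable names are illustrative.

```python
import math

# Figures copied from the report for 하행승차 (rounded as printed there).
N_ROWS = 426
MEAN = 1779345.5
STD = 4523701.6
Q1, Q3 = 148258, 1563154
MINIMUM, MAXIMUM = 0, 51189401

derived = {
    "range": MAXIMUM - MINIMUM,  # reported Range: 51189401
    "iqr": Q3 - Q1,              # reported IQR: 1414896
    "cv": STD / MEAN,            # reported CV: 2.5423402
    "variance": STD ** 2,        # reported Variance: 2.0463876e13
    "sum": MEAN * N_ROWS,        # reported Sum: 7.5800119e8
}

for name, value in derived.items():
    print(f"{name}: {value:.8g}")

# Floating-point comparison against a reported value uses a relative tolerance.
cv_matches = math.isclose(derived["cv"], 2.5423402, rel_tol=1e-6)
```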
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
0  32  7.5%
3465639  2  0.5%
653349  2  0.5%
110056  2  0.5%
718059  2  0.5%
236723  2  0.5%
1289786  2  0.5%
718648  2  0.5%
1632133  2  0.5%
951628  2  0.5%
Other values (370)  376  88.3%

Value  Count  Frequency (%)
0  32  7.5%
56  1  0.2%
76  1  0.2%
123  1  0.2%
190  1  0.2%
1983  1  0.2%
2594  1  0.2%
2638  1  0.2%
6261  1  0.2%
7902  1  0.2%

Value  Count  Frequency (%)
51189401  1  0.2%
33120273  1  0.2%
31573114  1  0.2%
30720609  1  0.2%
28168585  1  0.2%
26579650  1  0.2%
18029422  1  0.2%
15459515  1  0.2%
13634920  1  0.2%
12746665  1  0.2%

하행강차
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 370
Distinct (%): 86.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1779345.5
Minimum: 0
Maximum: 51189401
Zeros: 42
Zeros (%): 9.9%
Negative: 0
Negative (%): 0.0%
Memory size: 3.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 286646.75
median: 926332.5
Q3: 2223076
95-th percentile: 5360814.2
Maximum: 51189401
Range: 51189401
Interquartile range (IQR): 1936429.2

Descriptive statistics

Standard deviation: 3222960.8
Coefficient of variation (CV): 1.8113181
Kurtosis: 131.90462
Mean: 1779345.5
Median Absolute Deviation (MAD): 815666.5
Skewness: 9.3741154
Sum: 7.5800119 × 10^8
Variance: 1.0387476 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
0  42  9.9%
1711537  2  0.5%
2060952  2  0.5%
2428792  2  0.5%
461346  2  0.5%
2986377  2  0.5%
1137514  2  0.5%
1735796  2  0.5%
2057566  2  0.5%
4566730  2  0.5%
Other values (360)  366  85.9%

Value  Count  Frequency (%)
0  42  9.9%
9  1  0.2%
832  1  0.2%
4208  1  0.2%
10058  1  0.2%
10680  1  0.2%
10965  1  0.2%
13279  1  0.2%
24719  1  0.2%
24755  1  0.2%

Value  Count  Frequency (%)
51189401  1  0.2%
15459515  2  0.5%
14598233  1  0.2%
13239561  1  0.2%
10353684  1  0.2%
10013922  1  0.2%
9713748  1  0.2%
8496409  1  0.2%
7908283  1  0.2%
7489210  1  0.2%

상행승차
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 370
Distinct (%): 86.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1796267.5
Minimum: 0
Maximum: 53007652
Zeros: 42
Zeros (%): 9.9%
Negative: 0
Negative (%): 0.0%
Memory size: 3.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 292186
median: 1000777
Q3: 2240116.2
95-th percentile: 5355795.5
Maximum: 53007652
Range: 53007652
Interquartile range (IQR): 1947930.2

Descriptive statistics

Standard deviation: 3290382.4
Coefficient of variation (CV): 1.8317886
Kurtosis: 140.22536
Mean: 1796267.5
Median Absolute Deviation (MAD): 853891
Skewness: 9.7610278
Sum: 7.6520996 × 10^8
Variance: 1.0826616 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
0  42  9.9%
1778170  2  0.5%
2106612  2  0.5%
2373409  2  0.5%
499729  2  0.5%
3305661  2  0.5%
1163120  2  0.5%
1736650  2  0.5%
2210571  2  0.5%
4443772  2  0.5%
Other values (360)  366  85.9%

Value  Count  Frequency (%)
0  42  9.9%
10  1  0.2%
81  1  0.2%
87  1  0.2%
1588  1  0.2%
6365  1  0.2%
8017  1  0.2%
10055  1  0.2%
17361  1  0.2%
17922  1  0.2%

Value  Count  Frequency (%)
53007652  1  0.2%
16508835  2  0.5%
14262869  1  0.2%
13177830  1  0.2%
10766703  1  0.2%
9198064  1  0.2%
9004141  1  0.2%
8671289  1  0.2%
8047328  1  0.2%
7416882  1  0.2%

상행강차
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 379
Distinct (%): 89.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1796267.5
Minimum: 0
Maximum: 53007652
Zeros: 33
Zeros (%): 7.7%
Negative: 0
Negative (%): 0.0%
Memory size: 3.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 151682.25
median: 570741.5
Q3: 1594457.5
95-th percentile: 7009880.5
Maximum: 53007652
Range: 53007652
Interquartile range (IQR): 1442775.2

Descriptive statistics

Standard deviation: 4621733.3
Coefficient of variation (CV): 2.5729649
Kurtosis: 52.509089
Mean: 1796267.5
Median Absolute Deviation (MAD): 520118
Skewness: 6.4888836
Sum: 7.6520996 × 10^8
Variance: 2.1360419 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
0  33  7.7%
2980735  2  0.5%
695381  2  0.5%
102317  2  0.5%
768644  2  0.5%
262667  2  0.5%
1303715  2  0.5%
749907  2  0.5%
1797265  2  0.5%
1016683  2  0.5%
Other values (369)  375  88.0%

Value  Count  Frequency (%)
0  33  7.7%
137  1  0.2%
1966  1  0.2%
2461  1  0.2%
2680  1  0.2%
6600  1  0.2%
8017  1  0.2%
8103  1  0.2%
8870  1  0.2%
9628  1  0.2%

Value  Count  Frequency (%)
53007652  1  0.2%
33766010  1  0.2%
31573114  1  0.2%
30720609  1  0.2%
29374681  1  0.2%
27395470  1  0.2%
18013096  1  0.2%
16508835  1  0.2%
14106679  1  0.2%
12781217  1  0.2%

Interactions


Correlations

          선로명  하행승차  하행강차  상행승차  상행강차
선로명    1.000  0.000  0.000  0.197  0.000
하행승차  0.000  1.000  0.000  0.000  0.995
하행강차  0.000  0.000  1.000  0.998  0.000
상행승차  0.197  0.000  0.998  1.000  0.000
상행강차  0.000  0.995  0.000  0.000  1.000

          하행승차  하행강차  상행승차  상행강차  선로명
하행승차  1.000  0.413  0.395  0.981  0.000
하행강차  0.413  1.000  0.996  0.430  0.000
상행승차  0.395  0.996  1.000  0.416  0.083
상행강차  0.981  0.430  0.416  1.000  0.000
선로명    0.000  0.000  0.083  0.000  1.000
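
The High correlation alerts can be related to the second correlation matrix by scanning its upper triangle for coefficients above a cut-off. The sketch below does this with the values copied from that matrix. The threshold of 0.5 and the function name `highly_correlated_pairs` are assumptions made for illustration, not settings confirmed by this report.

```python
from itertools import combinations

# Column order and coefficients copied from the second correlation matrix.
COLUMNS = ["하행승차", "하행강차", "상행승차", "상행강차", "선로명"]
MATRIX = [
    [1.000, 0.413, 0.395, 0.981, 0.000],
    [0.413, 1.000, 0.996, 0.430, 0.000],
    [0.395, 0.996, 1.000, 0.416, 0.083],
    [0.981, 0.430, 0.416, 1.000, 0.000],
    [0.000, 0.000, 0.083, 0.000, 1.000],
]
THRESHOLD = 0.5  # hypothetical cut-off chosen for illustration


def highly_correlated_pairs(columns, matrix, threshold):
    """Return (column_a, column_b, coefficient) for each pair at or above threshold."""
    pairs = []
    # combinations() visits each unordered pair once, skipping the diagonal.
    for i, j in combinations(range(len(columns)), 2):
        if abs(matrix[i][j]) >= threshold:
            pairs.append((columns[i], columns[j], matrix[i][j]))
    return pairs


pairs = highly_correlated_pairs(COLUMNS, MATRIX, THRESHOLD)
for first, second, coefficient in pairs:
    print(f"{first} ~ {second}: {coefficient}")
```

Under that assumed threshold the scan returns the same two pairs that the alerts name: 하행승차 with 상행강차, and 하행강차 with 상행승차.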

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

     선로명  역명  하행승차  하행강차  상행승차  상행강차
0  경원선  제기동  26579650  0  0  27395470
1  경원선  청량리(서울시립대입구)  2838992  0  0  2557812
2  경원선  청량리  15459515  2421972  2239058  16508835
3  경원선  중랑  721338  8496409  8671289  823558
4  경원선  회기  1308236  2867895  2963724  1329010
5  경원선  외대앞  594561  2198969  2167898  545159
6  경원선  신이문  499986  2030692  2233975  491184
7  경원선  석계(6)  1491471  4953922  5371966  1440515
8  경원선  석계  661321  2299140  1998038  644606
9  경원선  광운대  690903  1943264  1956110  656700

     선로명  역명  하행승차  하행강차  상행승차  상행강차
416  동해선  기장  230003  941785  987093  244195
417  동해선  일광  94341  791895  820854  98553
418  동해선  좌천  60967  267330  289958  59476
419  동해선  월내  20902  164426  162152  22710
420  동해선  서생  29080  116855  119634  27481
421  동해선  남창  86922  287294  275368  79398
422  동해선  망양  16521  49815  52487  15154
423  동해선  덕하  10709  128752  125839  10266
424  동해선  개운포  2594  52447  53183  2680
425  동해선  태화강  0  1298330  1297176  0