Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 566.4 KiB
Average record size in memory: 58.0 B
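
The average record size follows from the two figures above it: total memory divided by the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the function name is illustrative, not part of the profiling tool):

```python
KIB = 1024  # bytes per kibibyte (assumed unit convention)


def average_record_size(total_kib: float, n_rows: int) -> float:
    """Return the mean in-memory size of one row, in bytes."""
    return total_kib * KIB / n_rows


size = average_record_size(566.4, 10_000)
print(f"{size:.1f} B")  # 58.0 B
```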

Variable types

DateTime: 1
Text: 1
Categorical: 2
Numeric: 2

Dataset

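Description: Boarding and alighting passenger counts for Korea Railroad Corporation (KORAIL) services from 2016 to 2020, broken down by stop station, service train type, and direction (up/down).
Author: 한국철도공사 (Korea Railroad Corporation)
URL: https://www.data.go.kr/data/15088875/fileData.do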

Alerts

Zeros: 승차인원수 (boarding passenger count) has 1144 (11.4%) zeros
Zeros: 하차인원수 (alighting passenger count) has 1088 (10.9%) zeros
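
The zero alerts report how many entries of a numeric column equal zero and what share of all rows that is. A minimal sketch of the calculation, run on a synthetic stand-in column rather than the real data (`zero_share` is a hypothetical helper name):

```python
def zero_share(values):
    """Return (count, percent) of entries that are exactly zero."""
    values = list(values)
    zeros = sum(1 for v in values if v == 0)
    return zeros, 100.0 * zeros / len(values)


# Synthetic stand-in with the same zero count as 승차인원수: 1144 of 10000 rows.
column = [0] * 1144 + [1] * 8856
count, percent = zero_share(column)
print(count, f"{percent:.1f}%")  # 1144 11.4%
```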

Reproduction

Analysis started: 2023-12-12 05:19:42.195701
Analysis finished: 2023-12-12 05:19:43.211358
Duration: 1.02 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
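
The reported duration is the difference between the two timestamps, rounded to two decimals. A short check with the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 05:19:42.195701", FMT)
finished = datetime.strptime("2023-12-12 05:19:43.211358", FMT)

# Subtracting two datetimes yields a timedelta.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 1.02 seconds
```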

Variables

운행일자 (service date)
DateTime

Distinct: 107
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2016-01-01 00:00:00
Maximum: 2016-04-16 00:00:00
Histogram with fixed size bins (bins=50)

정차역 (stop station)
Text

Distinct: 244
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 6
Median length: 2
Mean length: 2.3286
Min length: 2

Characters and Unicode

Total characters: 23286
Distinct characters: 181
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
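
As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The report's "Other Letter", "Uppercase Letter", and "Decimal Number" correspond to the category codes `Lo`, `Lu`, and `Nd`. The input strings are illustrative: two station names from this report plus an invented ASCII string, and `category_counts` is a hypothetical helper, not the profiler's own code.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    return Counter(unicodedata.category(ch) for text in values for ch in text)


# Illustrative input: Hangul syllables are "Lo", "T" is "Lu", "1" is "Nd".
counts = category_counts(["용산", "광주송정", "T1"])
print(counts["Lo"], counts["Lu"], counts["Nd"])  # 6 1 1
```

The same per-character pass also gives the length statistics: 23286 total characters over 10000 rows is a mean length of 2.3286.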

Unique

Unique: 6
Unique (%): 0.1%

Sample

1st row: 진영
2nd row: 구례구
3rd row: 경주
4th row: 광주송정
5th row: 천안
Value               Count  Frequency (%)
용산                153    1.5%
순천                152    1.5%
대전                143    1.4%
서울                142    1.4%
전주                137    1.4%
남원                136    1.4%
익산                133    1.3%
영등포              130    1.3%
창원중앙            129    1.3%
나주                126    1.3%
Other values (234)  8619   86.2%
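
The table lists the ten most frequent values and folds the rest into an "Other values" row. A minimal sketch of that layout using `collections.Counter`; `top_values` is a hypothetical helper and the small input list is made up for the demonstration:

```python
from collections import Counter


def top_values(values, k=10):
    """Return (value, count, percent) rows for the k most common values,
    followed by one aggregate row for everything else."""
    counts = Counter(values)
    total = sum(counts.values())
    top = counts.most_common(k)
    rows = [(value, n, 100.0 * n / total) for value, n in top]
    n_other = len(counts) - len(top)
    if n_other > 0:
        rest = total - sum(n for _, n in top)
        rows.append((f"Other values ({n_other})", rest, 100.0 * rest / total))
    return rows


# Made-up input, only to show the shape of the result.
rows = top_values(["용산"] * 3 + ["순천"] * 2 + ["대전", "서울"], k=2)
for value, n, pct in rows:
    print(value, n, f"{pct:.1f}%")

# Consistency check on the table above: the ten listed counts plus the
# "Other values" row add up to the 10000 observations.
TOP10_COUNTS = [153, 152, 143, 142, 137, 136, 133, 130, 129, 126]
print(10_000 - sum(TOP10_COUNTS))  # 8619
```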

Most occurring characters

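Value               Count  Frequency (%)
[missing]           1246   5.4%
[missing]           1191   5.1%
[missing]           1071   4.6%
[missing]           715    3.1%
[missing]           687    3.0%
[missing]           551    2.4%
[missing]           517    2.2%
[missing]           515    2.2%
[missing]           503    2.2%
[missing]           464    2.0%
Other values (171)  15826  68.0%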

Most occurring categories

Value             Count  Frequency (%)
Other Letter      23146  99.4%
Uppercase Letter  70     0.3%
Decimal Number    70     0.3%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
[missing]           1246   5.4%
[missing]           1191   5.1%
[missing]           1071   4.6%
[missing]           715    3.1%
[missing]           687    3.0%
[missing]           551    2.4%
[missing]           517    2.2%
[missing]           515    2.2%
[missing]           503    2.2%
[missing]           464    2.0%
Other values (169)  15686  67.8%

Uppercase Letter

Value  Count  Frequency (%)
T      70     100.0%

Decimal Number

Value  Count  Frequency (%)
1      70     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  23146  99.4%
Latin   70     0.3%
Common  70     0.3%

Most frequent character per script

Hangul

Value               Count  Frequency (%)
[missing]           1246   5.4%
[missing]           1191   5.1%
[missing]           1071   4.6%
[missing]           715    3.1%
[missing]           687    3.0%
[missing]           551    2.4%
[missing]           517    2.2%
[missing]           515    2.2%
[missing]           503    2.2%
[missing]           464    2.0%
Other values (169)  15686  67.8%

Latin

Value  Count  Frequency (%)
T      70     100.0%

Common

Value  Count  Frequency (%)
1      70     100.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  23146  99.4%
ASCII   140    0.6%

Most frequent character per block

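Hangul

Value               Count  Frequency (%)
[missing]           1246   5.4%
[missing]           1191   5.1%
[missing]           1071   4.6%
[missing]           715    3.1%
[missing]           687    3.0%
[missing]           551    2.4%
[missing]           517    2.2%
[missing]           515    2.2%
[missing]           503    2.2%
[missing]           464    2.0%
Other values (169)  15686  67.8%

ASCII

Value  Count  Frequency (%)
T      70     50.0%
1      70     50.0%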

역무열차종 (service train type)
Categorical

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value             Count
무궁화호          4532
ITX-새마을        1220
새마을호          1110
KTX-산천          864
KTX               823
Other values (3)  1451

Length

Max length: 7
Median length: 4
Mean length: 4.5277
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: KTX-호남
2nd row: KTX-산천
3rd row: 무궁화호
4th row: KTX
5th row: 새마을호

Common Values

Value       Count  Frequency (%)
무궁화호    4532   45.3%
ITX-새마을  1220   12.2%
새마을호    1110   11.1%
KTX-산천    864    8.6%
KTX         823    8.2%
KTX-호남    648    6.5%
누리로      584    5.8%
통근열차    219    2.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value       Count  Frequency (%)
무궁화호    4532   45.3%
itx-새마을  1220   12.2%
새마을호    1110   11.1%
ktx-산천    864    8.6%
ktx         823    8.2%
ktx-호남    648    6.5%
누리로      584    5.8%
통근열차    219    2.2%

상행하행구분 (up/down direction)
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value        Count
하행 (down)  5004
상행 (up)    4996

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 상행
2nd row: 상행
3rd row: 하행
4th row: 하행
5th row: 상행

Common Values

Value  Count  Frequency (%)
하행   5004   50.0%
상행   4996   50.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
하행   5004   50.0%
상행   4996   50.0%

승차인원수 (boarding passenger count)
Real number (ℝ)

ZEROS

Distinct: 1608
Distinct (%): 16.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 424.8136
Minimum: 0
Maximum: 42705
Zeros: 1144
Zeros (%): 11.4%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 7
Median: 51
Q3: 249
95-th percentile: 1816.05
Maximum: 42705
Range: 42705
Interquartile range (IQR): 242

Descriptive statistics

Standard deviation: 1598.9534
Coefficient of variation (CV): 3.7638941
Kurtosis: 237.20159
Mean: 424.8136
Median Absolute Deviation (MAD): 51
Skewness: 12.606692
Sum: 4248136
Variance: 2556652
Monotonicity: Not monotonic
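
Several of these figures are derived from one another, which makes them easy to cross-check. A small sketch using the rounded values reported for 승차인원수 (variable names are illustrative):

```python
# Reported statistics for 승차인원수 (rounded as shown in the report).
std, mean = 1598.9534, 424.8136
q1, q3 = 7, 249
minimum, maximum = 0, 42705
n_rows = 10_000

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
total = mean * n_rows            # sum of all values

print(f"CV={cv:.4f} variance={variance:.0f} IQR={iqr} range={value_range} sum={total:.0f}")
```

A CV near 3.8 together with a median of 51 against a mean of about 425 points to a strongly right-skewed distribution, consistent with the reported skewness of 12.6.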
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
0                    1144   11.4%
1                    312    3.1%
2                    264    2.6%
3                    213    2.1%
4                    203    2.0%
5                    158    1.6%
8                    136    1.4%
7                    124    1.2%
6                    120    1.2%
9                    109    1.1%
Other values (1598)  7217   72.2%

Minimum 10 values

Value  Count  Frequency (%)
0      1144   11.4%
1      312    3.1%
2      264    2.6%
3      213    2.1%
4      203    2.0%
5      158    1.6%
6      120    1.2%
7      124    1.2%
8      136    1.4%
9      109    1.1%

Maximum 10 values

Value  Count  Frequency (%)
42705  1      < 0.1%
40494  1      < 0.1%
37370  1      < 0.1%
36613  1      < 0.1%
34175  1      < 0.1%
31797  1      < 0.1%
31709  1      < 0.1%
30474  1      < 0.1%
26695  1      < 0.1%
24988  1      < 0.1%

하차인원수 (alighting passenger count)
Real number (ℝ)

ZEROS

Distinct: 1601
Distinct (%): 16.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 401.1301
Minimum: 0
Maximum: 38863
Zeros: 1088
Zeros (%): 10.9%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 6
Median: 49
Q3: 245
95-th percentile: 1742.1
Maximum: 38863
Range: 38863
Interquartile range (IQR): 239

Descriptive statistics

Standard deviation: 1508.6856
Coefficient of variation (CV): 3.7610879
Kurtosis: 269.88973
Mean: 401.1301
Median Absolute Deviation (MAD): 48
Skewness: 13.554036
Sum: 4011301
Variance: 2276132.1
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
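
With fixed size bins, the observed range is split into 50 intervals of equal width. The sketch below is a plain-Python illustration of that binning, not the plotting library's implementation; it follows the convention used by NumPy's histogram, where the last bin also includes the upper edge. `bin_index` and `histogram` are hypothetical helper names.

```python
def bin_index(value, lo, hi, bins=50):
    """Return the 0-based index of the equal-width bin that contains `value`."""
    if not lo <= value <= hi:
        raise ValueError("value outside histogram range")
    if hi == lo:
        return 0  # degenerate range: everything falls in the first bin
    width = (hi - lo) / bins
    # The top edge belongs to the last bin rather than a 51st one.
    return min(int((value - lo) / width), bins - 1)


def histogram(values, bins=50):
    """Count how many values fall in each of `bins` equal-width bins."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    for v in values:
        counts[bin_index(v, lo, hi, bins)] += 1
    return counts


# 하차인원수 ranges from 0 to 38863, so each bin is about 777.26 wide.
LO, HI = 0, 38863
print(bin_index(245, LO, HI))     # Q3 falls in bin 0
print(bin_index(1742.1, LO, HI))  # 95th percentile falls in bin 2
print(bin_index(38863, LO, HI))   # maximum falls in bin 49
```

Because Q3 (245) is smaller than one bin width, at least three quarters of the rows land in the first bin, so a plot on this scale is dominated by a single bar.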

Common values

Value                Count  Frequency (%)
0                    1088   10.9%
1                    395    4.0%
2                    304    3.0%
3                    226    2.3%
4                    193    1.9%
5                    174    1.7%
6                    141    1.4%
7                    138    1.4%
8                    125    1.2%
9                    117    1.2%
Other values (1591)  7099   71.0%

Minimum 10 values

Value  Count  Frequency (%)
0      1088   10.9%
1      395    4.0%
2      304    3.0%
3      226    2.3%
4      193    1.9%
5      174    1.7%
6      141    1.4%
7      138    1.4%
8      125    1.2%
9      117    1.2%

Maximum 10 values

Value  Count  Frequency (%)
38863  1      < 0.1%
37651  1      < 0.1%
36324  1      < 0.1%
35262  1      < 0.1%
34730  1      < 0.1%
33947  1      < 0.1%
33927  1      < 0.1%
32596  1      < 0.1%
29813  1      < 0.1%
22508  1      < 0.1%

Interactions


Correlations

              역무열차종  상행하행구분  승차인원수  하차인원수
역무열차종    1.000       0.000         0.179       0.159
상행하행구분  0.000       1.000         0.058       0.040
승차인원수    0.179       0.058         1.000       0.222
하차인원수    0.159       0.040         0.222       1.000

              상행하행구분  역무열차종
상행하행구분  1.000         0.000
역무열차종    0.000         1.000

              승차인원수  하차인원수  역무열차종  상행하행구분
승차인원수    1.000       0.269       0.086       0.044
하차인원수    0.269       1.000       0.078       0.040
역무열차종    0.086       0.078       1.000       0.000
상행하행구분  0.044       0.040       0.000       1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  운행일자    정차역    역무열차종  상행하행구분  승차인원수+하차인원수 (run together)
58743  2016-03-06  진영      KTX-호남    상행          631
46021  2016-02-21  구례구    KTX-산천    상행          260
73756  2016-03-23  경주      무궁화호    하행          508502
46820  2016-02-22  광주송정  KTX         하행          1001989
75843  2016-03-26  천안      새마을호    상행          379345
51310  2016-02-27  전주      무궁화호    상행          17571016
82054  2016-04-01  검암      KTX-호남    상행          032
88785  2016-04-09  익산      KTX-호남    상행          783510
43850  2016-02-19  천안      무궁화호    하행          41033376
84036  2016-04-04  천안      ITX-새마을  상행          451441

Last rows

Index  운행일자    정차역    역무열차종  상행하행구분  승차인원수+하차인원수 (run together)
23602  2016-01-27  민둥산    무궁화호    상행          3915
42249  2016-02-17  단양      ITX-새마을  상행          141
94596  2016-04-15  안강      무궁화호    하행          038
17090  2016-01-20  정읍      무궁화호    상행          39568
61618  2016-03-10  청도      ITX-새마을  상행          179
6137   2016-01-07  울산      KTX-산천    하행          88408
41363  2016-02-16  단양      무궁화호    하행          75142
50677  2016-02-26  동백산    무궁화호    하행          733
19504  2016-01-23  신탄진    무궁화호    상행          1589544
61801  2016-03-10  광주송정  KTX         상행          1806125