Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 566.4 KiB
Average record size in memory: 58.0 B
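
The average record size follows from the two figures above it. As a quick sanity check (a sketch only; the constants are copied from the dataset statistics, and the helper name is made up for illustration), the total in-memory size can be divided by the number of observations:

```python
# Figures copied from the dataset statistics in this report.
TOTAL_SIZE_KIB = 566.4
N_OBSERVATIONS = 10_000
BYTES_PER_KIB = 1024


def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Average in-memory size of one row, in bytes (illustrative helper)."""
    return total_kib * BYTES_PER_KIB / n_rows


avg = average_record_size_bytes(TOTAL_SIZE_KIB, N_OBSERVATIONS)
print(f"{avg:.1f} B")
```

Rounded to one decimal place, the result agrees with the reported average record size.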

Variable types

DateTime: 1
Categorical: 3
Numeric: 2

Dataset

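Description: Boarding and alighting passenger counts for the Gyeongin Line (경인선), operated by Korea Railroad Corporation (한국철도공사), broken down by service date, line, station, and time slot.
Author: Korea Railroad Corporation (한국철도공사)
URL: https://www.data.go.kr/data/15088844/fileData.do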

Alerts

전철선 (line) has constant value "경인선" [Constant]
전철승차인원수 (boarding passengers) is highly overall correlated with 전철하차인원수 (alighting passengers) [High correlation]
전철하차인원수 is highly overall correlated with 전철승차인원수 [High correlation]
전철승차인원수 has 194 (1.9%) zeros [Zeros]
전철하차인원수 has 495 (5.0%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-12 00:03:54.721767
Analysis finished: 2023-12-12 00:03:55.854122
Duration: 1.13 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
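
A report of this kind could be regenerated along the following lines. This is a minimal sketch, not the configuration actually used (that lives in config.json): the file paths and the report title are hypothetical, and the date column name 운행일자 is taken from the sample table in this report. The third-party imports are kept inside the function so the snippet can be loaded without pandas or ydata-profiling installed.

```python
from pathlib import Path

# Hypothetical paths: the report does not state the name of the input file.
CSV_PATH = Path("gyeongin_line_ridership.csv")
REPORT_PATH = Path("gyeongin_line_profile.html")

# Dataset metadata copied from the "Dataset" section of this report.
DATASET_INFO = {
    "description": "한국철도공사에서 운영중인 경인선의 운행일자별, 전철선별, "
                   "전철역별, 시간대별 승하차인원 데이터를 제공합니다.",
    "author": "한국철도공사",
    "url": "https://www.data.go.kr/data/15088844/fileData.do",
}


def build_report(csv_path: Path = CSV_PATH, out_path: Path = REPORT_PATH) -> Path:
    """Profile the CSV and write an HTML report; returns the output path."""
    # Lazy imports: both packages are third-party dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the date column is named 운행일자, as in the sample table.
    df = pd.read_csv(csv_path, parse_dates=["운행일자"])
    report = ProfileReport(
        df,
        title="Gyeongin Line ridership",  # hypothetical title
        dataset=DATASET_INFO,
    )
    report.to_file(out_path)
    return out_path
```

Calling `build_report()` with a real CSV path would write the HTML file. Public CSV files from data.go.kr are sometimes encoded as CP949, in which case an `encoding` argument to `read_csv` may be needed; that is an assumption about the input, not something this report states.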

Variables

운행일자 (service date)
DateTime

Distinct: 224
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2019-01-01 00:00:00
Maximum: 2019-08-12 00:00:00
Histogram with fixed size bins (bins=50)

전철선 (line)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value  Count
경인선  10000

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경인선
2nd row: 경인선
3rd row: 경인선
4th row: 경인선
5th row: 경인선

Common Values

Value  Count  Frequency (%)
경인선  10000  100.0%

Length

Histogram of lengths of the category

시간대구분 (time slot)
Categorical

Distinct: 22
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value  Count
17-18시간대  498
14-15시간대  492
15-16시간대  490
23-24시간대  487
04-05시간대  486
Other values (17)  7547

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 00-01시간대
2nd row: 11-12시간대
3rd row: 17-18시간대
4th row: 19-20시간대
5th row: 15-16시간대

Common Values

Value  Count  Frequency (%)
17-18시간대  498  5.0%
14-15시간대  492  4.9%
15-16시간대  490  4.9%
23-24시간대  487  4.9%
04-05시간대  486  4.9%
06-07시간대  483  4.8%
21-22시간대  483  4.8%
18-19시간대  480  4.8%
11-12시간대  476  4.8%
19-20시간대  472  4.7%
Other values (12)  5153  51.5%

Length

Histogram of lengths of the category

전철역 (station)
Categorical

Distinct: 20
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value  Count
소사  543
부천  536
주안  532
역곡  532
부개  519
Other values (15)  7338

Length

Max length: 3
Median length: 2
Mean length: 2.0954
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 부개
2nd row: 개봉
3rd row: 간석
4th row: 송내
5th row: (blank)

Common Values

Value  Count  Frequency (%)
소사  543  5.4%
부천  536  5.4%
주안  532  5.3%
역곡  532  5.3%
부개  519  5.2%
부평  518  5.2%
개봉  513  5.1%
온수  508  5.1%
(blank)  505  5.1%
오류동  503  5.0%
Other values (10)  4791  47.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
소사  543  5.4%
부천  536  5.4%
주안  532  5.3%
역곡  532  5.3%
부개  519  5.2%
부평  518  5.2%
개봉  513  5.1%
온수  508  5.1%
(blank)  505  5.1%
인천  503  5.0%
Other values (10)  4791  47.9%

전철승차인원수 (boarding passengers)
Real number (ℝ)

HIGH CORRELATION  ZEROS

Distinct: 2419
Distinct (%): 24.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 761.4383
Minimum: 0
Maximum: 6663
Zeros: 194
Zeros (%): 1.9%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 8
Q1: 195
Median: 483
Q3: 1050
95-th percentile: 2262.1
Maximum: 6663
Range: 6663
Interquartile range (IQR): 855

Descriptive statistics

Standard deviation: 874.61284
Coefficient of variation (CV): 1.1486326
Kurtosis: 10.119009
Mean: 761.4383
Median Absolute Deviation (MAD): 350
Skewness: 2.6453173
Sum: 7614383
Variance: 764947.62
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
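
Several of the 전철승차인원수 statistics above are derived from one another, which makes a quick consistency check possible. The sketch below is illustrative only: every constant is copied from this report, and the tolerances allow for the rounding in the printed values.

```python
# Values copied from the 전철승차인원수 statistics in this report.
N_OBSERVATIONS = 10_000
TOTAL = 7_614_383          # Sum
STD = 874.61284            # Standard deviation
Q1, Q3 = 195, 1050
MINIMUM, MAXIMUM = 0, 6663

mean = TOTAL / N_OBSERVATIONS      # mean = sum / number of observations
cv = STD / mean                    # coefficient of variation = std / mean
variance = STD ** 2                # variance = std squared
iqr = Q3 - Q1                      # interquartile range
value_range = MAXIMUM - MINIMUM    # range

print(f"mean={mean:.4f}")
print(f"cv={cv:.5f}")
print(f"variance={variance:.1f}")
print(f"iqr={iqr} range={value_range}")
```

A CV above 1 together with a mean (761.4) well above the median (483) is consistent with the strong right skew reported for this column.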

Common Values

Value  Count  Frequency (%)
0  194  1.9%
1  56  0.6%
3  47  0.5%
5  39  0.4%
7  38  0.4%
2  37  0.4%
6  35  0.4%
8  35  0.4%
9  35  0.4%
18  31  0.3%
Other values (2409)  9453  94.5%

Minimum 10 values

Value  Count  Frequency (%)
0  194  1.9%
1  56  0.6%
2  37  0.4%
3  47  0.5%
4  25  0.2%
5  39  0.4%
6  35  0.4%
7  38  0.4%
8  35  0.4%
9  35  0.4%

Maximum 10 values

Value  Count  Frequency (%)
6663  1  < 0.1%
6646  1  < 0.1%
6505  1  < 0.1%
6492  1  < 0.1%
6453  1  < 0.1%
6434  1  < 0.1%
6415  1  < 0.1%
6409  1  < 0.1%
6375  1  < 0.1%
6284  1  < 0.1%

전철하차인원수 (alighting passengers)
Real number (ℝ)

HIGH CORRELATION  ZEROS

Distinct: 2396
Distinct (%): 24.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 754.4958
Minimum: 0
Maximum: 6441
Zeros: 495
Zeros (%): 5.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 199
Median: 451.5
Q3: 979
95-th percentile: 2459.15
Maximum: 6441
Range: 6441
Interquartile range (IQR): 780

Descriptive statistics

Standard deviation: 883.76266
Coefficient of variation (CV): 1.1713288
Kurtosis: 8.6601133
Mean: 754.4958
Median Absolute Deviation (MAD): 315.5
Skewness: 2.5571262
Sum: 7544958
Variance: 781036.45
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values

Value  Count  Frequency (%)
0  495  5.0%
1  112  1.1%
2  28  0.3%
158  26  0.3%
192  22  0.2%
61  22  0.2%
55  21  0.2%
397  21  0.2%
225  20  0.2%
336  20  0.2%
Other values (2386)  9213  92.1%

Minimum 10 values

Value  Count  Frequency (%)
0  495  5.0%
1  112  1.1%
2  28  0.3%
3  10  0.1%
4  5  0.1%
5  4  < 0.1%
10  1  < 0.1%
11  3  < 0.1%
12  2  < 0.1%
13  1  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
6441  1  < 0.1%
6337  1  < 0.1%
6283  1  < 0.1%
6264  1  < 0.1%
6245  1  < 0.1%
6206  1  < 0.1%
6163  1  < 0.1%
6150  1  < 0.1%
6093  1  < 0.1%
6064  1  < 0.1%

Correlations

Variable  시간대구분  전철역  전철승차인원수  전철하차인원수
시간대구분  1.000  0.052  0.489  0.501
전철역  0.052  1.000  0.668  0.657
전철승차인원수  0.489  0.668  1.000  0.684
전철하차인원수  0.501  0.657  0.684  1.000

Variable  시간대구분  전철역
시간대구분  1.000  0.015
전철역  0.015  1.000

Variable  전철승차인원수  전철하차인원수  시간대구분  전철역
전철승차인원수  1.000  0.737  0.204  0.274
전철하차인원수  0.737  1.000  0.210  0.267
시간대구분  0.204  0.210  1.000  0.015
전철역  0.274  0.267  0.015  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index  운행일자  전철선  시간대구분  전철역  전철승차인원수전철하차인원수
1717  2019-01-05  경인선  00-01시간대  부개  8115
74811  2019-06-25  경인선  11-12시간대  개봉  1157680
75370  2019-06-26  경인선  17-18시간대  간석  372623
43835  2019-04-13  경인선  19-20시간대  송내  12261993
53124  2019-05-05  경인선  15-16시간대  (blank)  475321
87337  2019-07-24  경인선  19-20시간대  소사  2961198
23455  2019-02-25  경인선  00-01시간대  중동  12106
26912  2019-03-05  경인선  05-06시간대  백운  30365
50020  2019-04-28  경인선  09-10시간대  중동  565230
81752  2019-07-11  경인선  17-18시간대  개봉  12712321

Index  운행일자  전철선  시간대구분  전철역  전철승차인원수전철하차인원수
16157  2019-02-07  경인선  21-22시간대  부평  21612098
45140  2019-04-16  경인선  20-21시간대  동인천  7521070
92587  2019-08-06  경인선  04-05시간대  부개  740
88057  2019-07-26  경인선  12-13시간대  동암  787601
61870  2019-05-26  경인선  04-05시간대  온수  50
33921  2019-03-21  경인선  14-15시간대  부천  17701788
29946  2019-03-12  경인선  08-09시간대  (blank)  12141260
68299  2019-06-10  경인선  05-06시간대  소사  64887
54687  2019-05-09  경인선  08-09시간대  오류동  2423536
68833  2019-06-11  경인선  10-11시간대  간석  384202