Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 585.9 KiB
Average record size in memory: 60.0 B
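
The two memory figures can be cross-checked against each other. The sketch below assumes that 1 KiB is 1024 bytes and that the average record size is simply total bytes divided by the number of observations; the variable names are illustrative.

```python
# Cross-check the memory figures reported in the dataset statistics.
# Assumption: 1 KiB = 1024 bytes, average = total bytes / number of rows.
n_observations = 10_000
avg_record_bytes = 60.0  # "Average record size in memory"

total_bytes = n_observations * avg_record_bytes
total_kib = total_bytes / 1024

print(f"{total_kib:.1f} KiB")
```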

Variable types

Numeric: 3
Categorical: 2
Text: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-12914/S/1/datasetView.do

Alerts

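등록일자 (registration date) has constant value "20151217" [Constant]
승차총승객수 (total boarding passengers) is highly overall correlated with 하차총승객수 (total alighting passengers) [High correlation]
하차총승객수 (total alighting passengers) is highly overall correlated with 승차총승객수 (total boarding passengers) [High correlation]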

Reproduction

Analysis started: 2024-05-11 06:21:21.879126
Analysis finished: 2024-05-11 06:21:26.032469
Duration: 4.15 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
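
The reported duration follows from the two timestamps. A small check, assuming both are expressed in the same time zone:

```python
from datetime import datetime

# Timestamp layout used in the Reproduction block.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-05-11 06:21:21.879126", FMT)
finished = datetime.strptime("2024-05-11 06:21:26.032469", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```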

Variables

사용일자 (usage date)
Real number (ℝ)

Distinct: 184
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20150375
Minimum: 20150101
Maximum: 20150703
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20150101
5-th percentile: 20150110
Q1: 20150216
median: 20150404
Q3: 20150519
95-th percentile: 20150625
Maximum: 20150703
Range: 602
Interquartile range (IQR): 303
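
Because this variable is profiled as a number, the range of 602 is a difference between integers, not a count of days. The sketch below assumes the values encode dates as YYYYMMDD (which the minimum and maximum suggest) and uses a hypothetical helper, `parse_yyyymmdd`, to compare the numeric range with the calendar span.

```python
from datetime import datetime


def parse_yyyymmdd(value: int):
    """Interpret an integer such as 20150101 as a calendar date (assumed encoding)."""
    return datetime.strptime(str(value), "%Y%m%d").date()


first = parse_yyyymmdd(20150101)  # reported minimum
last = parse_yyyymmdd(20150703)   # reported maximum

numeric_range = 20150703 - 20150101   # what the profiler reports as "Range"
calendar_days = (last - first).days   # actual span between the two dates

print(numeric_range, calendar_days)
```

Under that assumption the span covers 184 calendar days counting both endpoints, which matches the 184 distinct values reported for this variable.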

Descriptive statistics

Standard deviation: 174.62552
Coefficient of variation (CV): 8.6661177 × 10⁻⁶
Kurtosis: -1.2188809
Mean: 20150375
Median Absolute Deviation (MAD): 127
Skewness: -0.021361828
Sum: 2.0150375 × 10¹¹
Variance: 30494.071
Monotonicity: Not monotonic
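
Two of the descriptive statistics are derived from the others: the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. A quick check using the rounded figures from the report (so small rounding differences are expected):

```python
import math

# Rounded values as printed in the report.
mean = 20150375
std = 174.62552

cv = std / mean       # coefficient of variation
variance = std ** 2   # variance from the standard deviation

print(f"{cv:.4e}")
print(f"{variance:.1f}")
```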
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
20150629            78     0.8%
20150404            76     0.8%
20150203            75     0.8%
20150613            74     0.7%
20150117            72     0.7%
20150429            69     0.7%
20150417            68     0.7%
20150521            68     0.7%
20150410            67     0.7%
20150408            67     0.7%
Other values (174)  9286   92.9%

Value     Count  Frequency (%)
20150101  48     0.5%
20150102  52     0.5%
20150103  52     0.5%
20150104  51     0.5%
20150105  55     0.5%
20150106  56     0.6%
20150107  42     0.4%
20150108  51     0.5%
20150109  58     0.6%
20150110  56     0.6%

Value     Count  Frequency (%)
20150703  34     0.3%
20150702  53     0.5%
20150701  66     0.7%
20150630  57     0.6%
20150629  78     0.8%
20150628  55     0.5%
20150627  59     0.6%
20150626  59     0.6%
20150625  58     0.6%
20150624  54     0.5%

노선명 (line name)
Categorical

Distinct: 23
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

7호선: 961
2호선: 927
5호선: 911
6호선: 700
경부선: 689
Other values (18): 5812

Length

Max length: 8
Median length: 3
Mean length: 3.1184
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 중앙선
2nd row: 2호선
3rd row: 경춘선
4th row: 4호선
5th row: 경원선

Common Values

Value              Count  Frequency (%)
7호선              961    9.6%
2호선              927    9.3%
5호선              911    9.1%
6호선              700    7.0%
경부선             689    6.9%
분당선             650    6.5%
3호선              586    5.9%
경원선             522    5.2%
4호선              474    4.7%
9호선              430    4.3%
Other values (13)  3150   31.5%
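
The frequency column and the "Other values" row can be reproduced from the counts alone. A sketch, assuming each percentage is the count divided by the 10000 observations:

```python
# Counts of the ten most common 노선명 values, as listed in the table above.
counts = {
    "7호선": 961, "2호선": 927, "5호선": 911, "6호선": 700, "경부선": 689,
    "분당선": 650, "3호선": 586, "경원선": 522, "4호선": 474, "9호선": 430,
}
total = 10_000  # number of observations

# Everything not in the top ten falls into "Other values".
other = total - sum(counts.values())

for value, count in counts.items():
    print(value, count, f"{count / total:.1%}")
print("Other values", other, f"{other / total:.1%}")
```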

Length

Histogram of lengths of the category

Value              Count  Frequency (%)
7호선              961    9.4%
2호선              927    9.1%
5호선              911    8.9%
6호선              700    6.9%
경부선             689    6.7%
분당선             650    6.4%
3호선              586    5.7%
경원선             522    5.1%
4호선              474    4.6%
9호선              430    4.2%
Other values (13)  3358   32.9%

역명 (station name)
Text

Distinct: 478
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 10
Median length: 2
Mean length: 2.8403
Min length: 2

Characters and Unicode

Total characters: 28403
Distinct characters: 268
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
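
As a rough illustration of these character properties, Python's standard `unicodedata` module returns the general category of each code point. The helper name `category_counts` and the three sample station names are chosen for this sketch; the short codes map to the category names used in this report (`Lo` is Other Letter, `Nd` is Decimal Number, `Ps` is Open Punctuation, `Pe` is Close Punctuation, `Zs` is Space Separator).

```python
import unicodedata
from collections import Counter


def category_counts(names):
    """Count Unicode general categories over every character of every name."""
    return Counter(unicodedata.category(ch) for name in names for ch in name)


# A few station names that appear in this report.
sample = ["종로3가", "서울역", "신당"]
counts = category_counts(sample)

for category, count in counts.most_common():
    print(category, count)
```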

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 양평
2nd row: 건대입구
3rd row: 대성리
4th row: 미아
5th row: 동두천

Value               Count  Frequency (%)
공덕                79     0.8%
서울역              74     0.7%
고속터미널          56     0.6%
동대문역사문화공원  56     0.6%
디지털미디어시티    53     0.5%
홍대입구            53     0.5%
김포공항            51     0.5%
신당                50     0.5%
종로3가             50     0.5%
약수                49     0.5%
Other values (467)  9446   94.3%

Most occurring characters

Value               Count  Frequency (%)
(missing)           886    3.1%
(missing)           873    3.1%
(missing)           746    2.6%
(missing)           734    2.6%
(missing)           688    2.4%
(missing)           615    2.2%
(missing)           550    1.9%
(missing)           520    1.8%
(missing)           501    1.8%
(missing)           477    1.7%
Other values (258)  21813  76.8%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       27942  98.4%
Decimal Number     150    0.5%
Open Punctuation   147    0.5%
Close Punctuation  147    0.5%
Space Separator    17     0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(missing)           886    3.2%
(missing)           873    3.1%
(missing)           746    2.7%
(missing)           734    2.6%
(missing)           688    2.5%
(missing)           615    2.2%
(missing)           550    2.0%
(missing)           520    1.9%
(missing)           501    1.8%
(missing)           477    1.7%
Other values (252)  21352  76.4%

Decimal Number
Value  Count  Frequency (%)
3      81     54.0%
4      46     30.7%
5      23     15.3%

Open Punctuation
Value  Count  Frequency (%)
(      147    100.0%

Close Punctuation
Value  Count  Frequency (%)
)      147    100.0%

Space Separator
Value    Count  Frequency (%)
(space)  17     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  27942  98.4%
Common  461    1.6%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
(missing)           886    3.2%
(missing)           873    3.1%
(missing)           746    2.7%
(missing)           734    2.6%
(missing)           688    2.5%
(missing)           615    2.2%
(missing)           550    2.0%
(missing)           520    1.9%
(missing)           501    1.8%
(missing)           477    1.7%
Other values (252)  21352  76.4%

Common
Value    Count  Frequency (%)
(        147    31.9%
)        147    31.9%
3        81     17.6%
4        46     10.0%
5        23     5.0%
(space)  17     3.7%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  27942  98.4%
ASCII   461    1.6%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
(missing)           886    3.2%
(missing)           873    3.1%
(missing)           746    2.7%
(missing)           734    2.6%
(missing)           688    2.5%
(missing)           615    2.2%
(missing)           550    2.0%
(missing)           520    1.9%
(missing)           501    1.8%
(missing)           477    1.7%
Other values (252)  21352  76.4%

ASCII
Value    Count  Frequency (%)
(        147    31.9%
)        147    31.9%
3        81     17.6%
4        46     10.0%
5        23     5.0%
(space)  17     3.7%

승차총승객수 (total boarding passengers)
Real number (ℝ)

HIGH CORRELATION

Distinct: 8323
Distinct (%): 83.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13270.997
Minimum: 1
Maximum: 142712
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1014.75
Q1: 4338.75
median: 9110
Q3: 16962.5
95-th percentile: 39842.1
Maximum: 142712
Range: 142711
Interquartile range (IQR): 12623.75
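
The range and interquartile range follow directly from the quantiles listed above. A short check using the reported values:

```python
# Quantiles reported for 승차총승객수.
minimum, maximum = 1, 142712
q1, q3 = 4338.75, 16962.5

value_range = maximum - minimum  # "Range"
iqr = q3 - q1                    # "Interquartile range (IQR)"

print(value_range, iqr)
```

The reported mean (13270.997) lies well above the median (9110), which is what one would expect from a right-skewed distribution with a long upper tail.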

Descriptive statistics

Standard deviation: 13794.959
Coefficient of variation (CV): 1.0394818
Kurtosis: 9.3258378
Mean: 13270.997
Median Absolute Deviation (MAD): 5573.5
Skewness: 2.5085234
Sum: 1.3270997 × 10⁸
Variance: 1.903009 × 10⁸
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
1                    21     0.2%
2                    16     0.2%
8007                 5      0.1%
3703                 5      0.1%
8018                 4      < 0.1%
6503                 4      < 0.1%
2873                 4      < 0.1%
8488                 4      < 0.1%
2147                 4      < 0.1%
1584                 4      < 0.1%
Other values (8313)  9929   99.3%

Value  Count  Frequency (%)
1      21     0.2%
2      16     0.2%
3      2      < 0.1%
4      2      < 0.1%
14     1      < 0.1%
23     2      < 0.1%
29     2      < 0.1%
37     1      < 0.1%
42     2      < 0.1%
44     1      < 0.1%

Value   Count  Frequency (%)
142712  1      < 0.1%
134866  1      < 0.1%
126439  1      < 0.1%
122909  1      < 0.1%
119070  1      < 0.1%
117339  1      < 0.1%
112390  1      < 0.1%
110603  1      < 0.1%
110024  1      < 0.1%
106009  1      < 0.1%

하차총승객수 (total alighting passengers)
Real number (ℝ)

HIGH CORRELATION

Distinct: 8289
Distinct (%): 82.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13239.073
Minimum: 0
Maximum: 150248
Zeros: 41
Zeros (%): 0.4%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 980.8
Q1: 4211.5
median: 8851.5
Q3: 16830.75
95-th percentile: 41933.75
Maximum: 150248
Range: 150248
Interquartile range (IQR): 12619.25

Descriptive statistics

Standard deviation: 14016.682
Coefficient of variation (CV): 1.0587359
Kurtosis: 9.5964128
Mean: 13239.073
Median Absolute Deviation (MAD): 5476.5
Skewness: 2.5284342
Sum: 1.3239073 × 10⁸
Variance: 1.9646738 × 10⁸
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
0                    41     0.4%
4206                 6      0.1%
2773                 5      0.1%
932                  5      0.1%
5745                 5      0.1%
3442                 4      < 0.1%
6104                 4      < 0.1%
2309                 4      < 0.1%
8051                 4      < 0.1%
2189                 4      < 0.1%
Other values (8279)  9918   99.2%

Value  Count  Frequency (%)
0      41     0.4%
14     1      < 0.1%
19     1      < 0.1%
21     1      < 0.1%
22     1      < 0.1%
25     1      < 0.1%
30     1      < 0.1%
32     2      < 0.1%
35     1      < 0.1%
36     1      < 0.1%

Value   Count  Frequency (%)
150248  1      < 0.1%
135565  1      < 0.1%
130100  1      < 0.1%
127996  1      < 0.1%
126674  1      < 0.1%
123611  1      < 0.1%
117022  1      < 0.1%
114785  1      < 0.1%
113516  1      < 0.1%
113155  1      < 0.1%

등록일자 (registration date)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

20151217: 10000

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20151217
2nd row: 20151217
3rd row: 20151217
4th row: 20151217
5th row: 20151217

Common Values

Value     Count  Frequency (%)
20151217  10000  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value     Count  Frequency (%)
20151217  10000  100.0%

Correlations

              사용일자  노선명  승차총승객수  하차총승객수
사용일자      1.000     0.000   0.041         0.069
노선명        0.000     1.000   0.490         0.484
승차총승객수  0.041     0.490   1.000         0.984
하차총승객수  0.069     0.484   0.984         1.000

              사용일자  승차총승객수  하차총승객수  노선명
사용일자      1.000     0.022         0.022         0.000
승차총승객수  0.022     1.000         0.994         0.204
하차총승객수  0.022     0.994         1.000         0.201
노선명        0.000     0.204         0.201         1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index)  사용일자  노선명  역명          승차총승객수  하차총승객수  등록일자
74252    20150517  중앙선  양평          3975          3483          20151217
25413    20150216  2호선   건대입구      45733         48676         20151217
71087    20150511  경춘선  대성리        615           509           20151217
51613    20150406  4호선   미아          23644         21746         20151217
13870    20150126  경원선  동두천        2093          2714          20151217
19313    20150205  분당선  매교          2908          2828          20151217
32162    20150301  5호선   명일          6543          6484          20151217
73913    20150516  경의선  서울역        2714          2454          20151217
64908    20150430  6호선   삼각지        8489          7138          20151217
78484    20150525  경부선  남영          6551          7074          20151217

(index)  사용일자  노선명  역명          승차총승객수  하차총승객수  등록일자
23035    20150212  중앙선  덕소          6819          6949          20151217
92571    20150620  5호선   마포          8210          8156          20151217
65005    20150430  7호선   삼산체육관    6712          5894          20151217
96675    20150627  3호선   연신내        36871         34043         20151217
57894    20150417  경인선  백운          11417         11234         20151217
93476    20150621  2호선   역삼          12376         13155         20151217
10447    20150120  2호선   신당          16606         17441         20151217
31107    20150227  9호선   개화          7628          4996          20151217
89922    20150615  2호선   시청          27615         27961         20151217
20806    20150208  6호선   보문          4763          4475          20151217