Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 585.9 KiB
Average record size in memory: 60.0 B
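
As a quick sanity check, the average record size can be recomputed from the total size and the number of observations. This is a minimal sketch; it assumes 1 KiB means 1024 bytes, and the variable names are illustrative only.

```python
# Figures copied from the dataset statistics above.
total_size_kib = 585.9      # total size in memory, in KiB
n_observations = 10_000     # number of rows

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{avg_record_bytes:.1f} B per record")
```

Under that assumption the result rounds to the 60.0 B shown in the report.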

Variable types

Numeric: 4
Categorical: 1
Text: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-12914/S/1/datasetView.do

Alerts

High correlation: 사용일자 is highly overall correlated with 등록일자
High correlation: 승차총승객수 is highly overall correlated with 하차총승객수
High correlation: 하차총승객수 is highly overall correlated with 승차총승객수
High correlation: 등록일자 is highly overall correlated with 사용일자

Reproduction

Analysis started: 2024-05-11 06:21:08.379465
Analysis finished: 2024-05-11 06:21:13.942007
Duration: 5.56 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
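
A report of this kind can be regenerated along the following lines. This is only a sketch under stated assumptions: the function name `build_report`, the file paths, and the report title are hypothetical, and the CSV is assumed to be readable with pandas' default settings (an explicit `encoding=` may be needed for the actual download).

```python
def build_report(csv_path: str, html_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (paths are hypothetical)."""
    # Imported lazily so the function can be defined without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is a plain CSV that pandas can read with defaults.
    df = pd.read_csv(csv_path)

    # Build the profile and save it as a standalone HTML page.
    profile = ProfileReport(df, title="Dataset profile")
    profile.to_file(html_path)
```

Calling `build_report("ridership.csv")` (a placeholder file name) would write `report.html` next to the script.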

Variables

사용일자 (date of use)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 182
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20160365
Minimum: 20160101
Maximum: 20160630
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20160101
5-th percentile: 20160110
Q1: 20160214
Median: 20160401
Q3: 20160514
95-th percentile: 20160621
Maximum: 20160630
Range: 529
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 170.12512
Coefficient of variation (CV): 8.4385934 × 10⁻⁶
Kurtosis: -1.248385
Mean: 20160365
Median Absolute Deviation (MAD): 128
Skewness: -0.012112141
Sum: 2.0160365 × 10¹¹
Variance: 28942.558
Monotonicity: Not monotonic
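
The values of 사용일자 appear to be dates encoded as YYYYMMDD integers, so the range (529) and IQR (300) above measure differences between integer codes rather than elapsed days. A minimal sketch of converting the reported minimum and maximum to calendar dates with the Python standard library; the helper name `parse_yyyymmdd` is hypothetical.

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Convert an integer such as 20160101 to a date (hypothetical helper)."""
    return datetime.strptime(str(value), "%Y%m%d").date()


first = parse_yyyymmdd(20160101)   # reported minimum
last = parse_yyyymmdd(20160630)    # reported maximum

numeric_range = 20160630 - 20160101    # difference between the integer codes
elapsed_days = (last - first).days     # calendar days between the two dates
distinct_days = elapsed_days + 1       # inclusive count of calendar dates

print(numeric_range, elapsed_days, distinct_days)
```

The inclusive count of calendar dates agrees with the 182 distinct values reported for this variable, which suggests every day in the period is present.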
Histogram with fixed size bins (bins=50)
Value     Count  Frequency (%)
20160122  81     0.8%
20160330  72     0.7%
20160529  70     0.7%
20160117  68     0.7%
20160423  68     0.7%
20160528  67     0.7%
20160302  67     0.7%
20160426  67     0.7%
20160513  67     0.7%
20160610  66     0.7%
Other values (172)  9307  93.1%

Value     Count  Frequency (%)
20160101  49     0.5%
20160102  41     0.4%
20160103  61     0.6%
20160104  60     0.6%
20160105  50     0.5%
20160106  55     0.5%
20160107  64     0.6%
20160108  61     0.6%
20160109  45     0.4%
20160110  50     0.5%

Value     Count  Frequency (%)
20160630  17     0.2%
20160629  60     0.6%
20160628  55     0.5%
20160627  57     0.6%
20160626  63     0.6%
20160625  55     0.5%
20160624  46     0.5%
20160623  56     0.6%
20160622  60     0.6%
20160621  60     0.6%

노선명 (line name)
Categorical

Distinct: 23
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

5호선  967
7호선  933
2호선  907
경부선  701
6호선  659
Other values (18)  5833

Length

Max length: 8
Median length: 3
Mean length: 3.1361
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 9호선2단계
2nd row: 7호선
3rd row: 5호선
4th row: 분당선
5th row: 분당선

Common Values

Value   Count  Frequency (%)
5호선   967    9.7%
7호선   933    9.3%
2호선   907    9.1%
경부선  701    7.0%
6호선   659    6.6%
분당선  609    6.1%
3호선   586    5.9%
경원선  512    5.1%
9호선   458    4.6%
경의선  456    4.6%
Other values (13)  3212  32.1%

Length

Histogram of lengths of the category
Value   Count  Frequency (%)
5호선   967    9.5%
7호선   933    9.1%
2호선   907    8.9%
경부선  701    6.9%
6호선   659    6.5%
분당선  609    6.0%
3호선   586    5.7%
경원선  512    5.0%
9호선   458    4.5%
경의선  456    4.5%
Other values (13)  3423  33.5%

역명 (station name)
Text

Distinct: 483
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 10
Median length: 2
Mean length: 2.8403
Min length: 2

Characters and Unicode

Total characters: 28403
Distinct characters: 268
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
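
To illustrate the kind of character properties meant here, the sketch below counts Unicode general categories for two station names that appear in this report, using Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and this is not necessarily how the profiling tool computes its figures.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> dict:
    """Count Unicode general categories (e.g. 'Lo', 'Nd') in a string."""
    return dict(Counter(unicodedata.category(ch) for ch in text))


# Station names taken from this report.
for name in ["서울역", "종로3가"]:
    print(name, category_counts(name))

# The standard library has no script or block lookup; the character
# name is used here only as a rough hint of the script.
print(unicodedata.name("종"))
```

In this notation `Lo` corresponds to "Other Letter" and `Nd` to "Decimal Number", two of the five categories listed for this variable.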

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 언주
2nd row: 하계
3rd row: 송정
4th row: 매교
5th row: 서현

Value               Count  Frequency (%)
서울역              88     0.9%
공덕                69     0.7%
종로3가             67     0.7%
동대문역사문화공원  55     0.5%
홍대입구            55     0.5%
김포공항            54     0.5%
디지털미디어시티    54     0.5%
고속터미널          53     0.5%
청구                51     0.5%
여의도              46     0.5%
Other values (472)  9421   94.1%

Most occurring characters

Value  Count  Frequency (%)
       926    3.3%
       908    3.2%
       807    2.8%
       707    2.5%
       702    2.5%
       615    2.2%
       544    1.9%
       538    1.9%
       500    1.8%
       441    1.6%
Other values (258)  21715  76.5%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       27936  98.4%
Decimal Number     164    0.6%
Close Punctuation  145    0.5%
Open Punctuation   145    0.5%
Space Separator    13     < 0.1%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
       926    3.3%
       908    3.3%
       807    2.9%
       707    2.5%
       702    2.5%
       615    2.2%
       544    1.9%
       538    1.9%
       500    1.8%
       441    1.6%
Other values (252)  21248  76.1%

Decimal Number
Value  Count  Frequency (%)
3      110    67.1%
4      35     21.3%
5      19     11.6%

Close Punctuation
Value  Count  Frequency (%)
)      145    100.0%

Open Punctuation
Value  Count  Frequency (%)
(      145    100.0%

Space Separator
Value    Count  Frequency (%)
(space)  13     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  27936  98.4%
Common  467    1.6%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
       926    3.3%
       908    3.3%
       807    2.9%
       707    2.5%
       702    2.5%
       615    2.2%
       544    1.9%
       538    1.9%
       500    1.8%
       441    1.6%
Other values (252)  21248  76.1%

Common
Value    Count  Frequency (%)
)        145    31.0%
(        145    31.0%
3        110    23.6%
4        35     7.5%
5        19     4.1%
(space)  13     2.8%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  27936  98.4%
ASCII   467    1.6%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
       926    3.3%
       908    3.3%
       807    2.9%
       707    2.5%
       702    2.5%
       615    2.2%
       544    1.9%
       538    1.9%
       500    1.8%
       441    1.6%
Other values (252)  21248  76.1%

ASCII
Value    Count  Frequency (%)
)        145    31.0%
(        145    31.0%
3        110    23.6%
4        35     7.5%
5        19     4.1%
(space)  13     2.8%

승차총승객수 (total boarding passengers)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8335
Distinct (%): 83.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13204.762
Minimum: 1
Maximum: 124139
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1101.95
Q1: 4456.5
Median: 9247.5
Q3: 17124.25
95-th percentile: 38818.75
Maximum: 124139
Range: 124138
Interquartile range (IQR): 12667.75

Descriptive statistics

Standard deviation: 13485.51
Coefficient of variation (CV): 1.0212611
Kurtosis: 9.1112197
Mean: 13204.762
Median Absolute Deviation (MAD): 5740
Skewness: 2.5082716
Sum: 1.3204762 × 10⁸
Variance: 1.8185897 × 10⁸
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
1      20     0.2%
2      11     0.1%
2808   6      0.1%
5601   5      0.1%
373    5      0.1%
2412   4      < 0.1%
5368   4      < 0.1%
4650   4      < 0.1%
5912   4      < 0.1%
7820   4      < 0.1%
Other values (8325)  9933  99.3%

Value  Count  Frequency (%)
1      20     0.2%
2      11     0.1%
3      2      < 0.1%
4      1      < 0.1%
6      1      < 0.1%
15     1      < 0.1%
30     1      < 0.1%
31     1      < 0.1%
37     1      < 0.1%
39     1      < 0.1%

Value   Count  Frequency (%)
124139  1      < 0.1%
122270  1      < 0.1%
117892  1      < 0.1%
115593  1      < 0.1%
111807  1      < 0.1%
109540  1      < 0.1%
109357  1      < 0.1%
109347  1      < 0.1%
106547  1      < 0.1%
106465  1      < 0.1%

하차총승객수 (total alighting passengers)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8348
Distinct (%): 83.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13142.806
Minimum: 0
Maximum: 124517
Zeros: 35
Zeros (%): 0.4%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1064
Q1: 4317.75
Median: 9045
Q3: 17123.5
95-th percentile: 39233.2
Maximum: 124517
Range: 124517
Interquartile range (IQR): 12805.75

Descriptive statistics

Standard deviation: 13635.503
Coefficient of variation (CV): 1.0374879
Kurtosis: 8.6844222
Mean: 13142.806
Median Absolute Deviation (MAD): 5607.5
Skewness: 2.4726591
Sum: 1.3142806 × 10⁸
Variance: 1.8592693 × 10⁸
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
0      35     0.4%
8211   5      0.1%
2685   5      0.1%
4512   4      < 0.1%
958    4      < 0.1%
964    4      < 0.1%
625    4      < 0.1%
2957   4      < 0.1%
2615   4      < 0.1%
2213   4      < 0.1%
Other values (8338)  9927  99.3%

Value  Count  Frequency (%)
0      35     0.4%
21     1      < 0.1%
22     1      < 0.1%
23     1      < 0.1%
27     1      < 0.1%
29     1      < 0.1%
30     3      < 0.1%
31     1      < 0.1%
32     2      < 0.1%
34     3      < 0.1%

Value   Count  Frequency (%)
124517  1      < 0.1%
121767  1      < 0.1%
120455  1      < 0.1%
116654  1      < 0.1%
111527  1      < 0.1%
110346  1      < 0.1%
109698  1      < 0.1%
109468  1      < 0.1%
108548  1      < 0.1%
106917  1      < 0.1%

등록일자 (registration date)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 182
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20160391
Minimum: 20160109
Maximum: 20160708
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20160109
5-th percentile: 20160118
Q1: 20160222
Median: 20160409
Q3: 20160522
95-th percentile: 20160629
Maximum: 20160708
Range: 599
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 174.46244
Coefficient of variation (CV): 8.6537229 × 10⁻⁶
Kurtosis: -1.1520483
Mean: 20160391
Median Absolute Deviation (MAD): 121
Skewness: -0.010226496
Sum: 2.0160391 × 10¹¹
Variance: 30437.143
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value     Count  Frequency (%)
20160130  81     0.8%
20160407  72     0.7%
20160606  70     0.7%
20160125  68     0.7%
20160501  68     0.7%
20160605  67     0.7%
20160310  67     0.7%
20160504  67     0.7%
20160521  67     0.7%
20160618  66     0.7%
Other values (172)  9307  93.1%

Value     Count  Frequency (%)
20160109  49     0.5%
20160110  41     0.4%
20160111  61     0.6%
20160112  60     0.6%
20160113  50     0.5%
20160114  55     0.5%
20160115  64     0.6%
20160116  61     0.6%
20160117  45     0.4%
20160118  50     0.5%

Value     Count  Frequency (%)
20160708  17     0.2%
20160707  60     0.6%
20160706  55     0.5%
20160705  57     0.6%
20160704  63     0.6%
20160703  55     0.5%
20160702  46     0.5%
20160701  56     0.6%
20160630  60     0.6%
20160629  60     0.6%

Interactions

[Figure: scatter plots for each pair of the four numeric variables (16 panels)]

Correlations

              사용일자  노선명  승차총승객수  하차총승객수  등록일자
사용일자      1.000     0.000   0.044         0.060         0.969
노선명        0.000     1.000   0.509         0.508         0.000
승차총승객수  0.044     0.509   1.000         0.987         0.067
하차총승객수  0.060     0.508   0.987         1.000         0.080
등록일자      0.969     0.000   0.067         0.080         1.000

              사용일자  승차총승객수  하차총승객수  등록일자  노선명
사용일자      1.000     0.058         0.055         1.000     0.000
승차총승객수  0.058     1.000         0.993         0.058     0.214
하차총승객수  0.055     0.993         1.000         0.055     0.213
등록일자      1.000     0.058         0.055         1.000     0.000
노선명        0.000     0.214         0.213         0.000     1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

       사용일자  노선명      역명    승차총승객수  하차총승객수  등록일자
37952  20160310  9호선2단계  언주    8304          8416          20160318
48630  20160329  7호선       하계    24567         23476         20160406
8851   20160117  5호선       송정    6621          7105          20160125
81521  20160527  분당선      매교    3392          3361          20160604
20215  20160206  분당선      서현    19608         20121         20160214
62299  20160423  7호선       대림    10140         11263         20160501
96837  20160624  6호선       화랑대  14629         10649         20160702
40594  20160314  7호선       보라매  11860         11975         20160322
13471  20160125  5호선       우장산  16460         16355         20160202
12254  20160123  경원선      회룡    10526         10100         20160131

       사용일자  노선명  역명            승차총승객수  하차총승객수  등록일자
78532  20160522  경부선  광명            763           522           20160530
87442  20160607  3호선   남부터미널      38549         40031         20160615
53260  20160406  안산선  고잔            10651         10476         20160414
29465  20160223  경부선  가산디지털단지  19625         23339         20160302
66482  20160430  경부선  명학            8427          8201          20160508
98781  20160628  2호선   구의            28606         27924         20160706
33697  20160302  2호선   당산            25876         29196         20160310
13807  20160126  6호선   불광            5856          5671          20160203
95944  20160622  2호선   사당            48307         53247         20160630
34331  20160303  3호선   무악재          4954          4989          20160311
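
In the sample rows, 등록일자 appears to trail 사용일자 by the same number of days in every row, which would explain the high correlation between the two columns flagged in the alerts. The sketch below checks this on a few (사용일자, 등록일자) pairs copied from the sample. The helper name `to_date` is hypothetical, and the check covers only these rows, not the full dataset.

```python
from datetime import date, datetime


def to_date(value: int) -> date:
    """Parse a YYYYMMDD integer into a date (hypothetical helper)."""
    return datetime.strptime(str(value), "%Y%m%d").date()


# (사용일자, 등록일자) pairs copied from the sample rows.
pairs = [
    (20160310, 20160318),
    (20160329, 20160406),
    (20160117, 20160125),
    (20160527, 20160604),
    (20160223, 20160302),
    (20160628, 20160706),
]

# Lag in calendar days between use and registration for each pair.
lags = [(to_date(registered) - to_date(used)).days for used, registered in pairs]
print(lags)
```

If the lag is constant across the whole dataset, one of the two date columns carries no extra information and could be dropped or derived from the other before modelling.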