Overview

Dataset statistics

Number of variables: 4
Number of observations: 601
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 20.1 KiB
Average record size in memory: 34.2 B
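
The average record size follows from the two figures above it: total memory divided by the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the variable names are illustrative):

```python
# Figures copied from the dataset statistics above.
KIB = 1024  # bytes per KiB (assumed binary prefix)
total_size_bytes = 20.1 * KIB
n_observations = 601

# Average record size = total memory / number of rows.
avg_record_bytes = total_size_bytes / n_observations
print(f"{avg_record_bytes:.1f} B")
```

The result rounds to 34.2 B, which agrees with the reported value.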

Variable types

Categorical: 2
Numeric: 2

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15596/F/1/datasetView.do

Alerts

대중교통구분 (public transport type) has constant value "지하철" (subway) [Constant]
승차총승객수 (total boarding passengers) is highly overall correlated with 노선명 (line name) [High correlation]
노선명 is highly overall correlated with 승차총승객수 [High correlation]
승차총승객수 has unique values [Unique]
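
The Constant and Unique alerts both come down to counting distinct values in a column. The sketch below illustrates that idea with a hypothetical helper, `column_alerts`; it is not the profiling library's own implementation, and the inputs are the first five rows listed in the Sample section of this report.

```python
from typing import Hashable, Sequence, Set


def column_alerts(values: Sequence[Hashable]) -> Set[str]:
    """Return illustrative alert labels for one column (hypothetical helper)."""
    alerts = set()
    n_distinct = len(set(values))
    if n_distinct == 1:
        alerts.add("Constant")  # every row holds the same value
    if len(values) > 1 and n_distinct == len(values):
        alerts.add("Unique")  # no value repeats
    return alerts


# First five rows of two columns, copied from the Sample section.
transit_type = ["지하철"] * 5
boardings = [8633618, 8737235, 8145989, 7273309, 8692551]

print(column_alerts(transit_type))
print(column_alerts(boardings))
```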

Reproduction

Analysis started: 2024-03-30 05:46:04.825364
Analysis finished: 2024-03-30 05:46:07.821022
Duration: 3 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
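
A report of this kind can be regenerated along the following lines. This is a sketch under stated assumptions, not the script that produced this report: the CSV path, its encoding, and the report title are not given here and are placeholders, and `build_report` is a hypothetical wrapper around pandas and ydata-profiling.

```python
# Dataset metadata as shown in the Dataset section of this report.
DATASET_META = {
    "description": "파일 다운로드",  # "File download"
    "author": "서울특별시",
    "url": "https://data.seoul.go.kr/dataList/OA-15596/F/1/datasetView.do",
}


def build_report(csv_path: str, output_path: str = "report.html", encoding: str = "utf-8"):
    """Profile a CSV export and write an HTML report (illustrative sketch)."""
    # Imported lazily so this module loads even where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # csv_path and encoding are assumptions; the source file is not named in this report.
    df = pd.read_csv(csv_path, encoding=encoding)
    # The title is a placeholder; the original title is not recorded here.
    report = ProfileReport(df, title="Profiling Report", dataset=DATASET_META)
    report.to_file(output_path)
    return report
```

Calling `build_report("subway_boardings.csv")` with a local copy of the data (the file name is hypothetical) would write `report.html` to the working directory.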

Variables

대중교통구분 (public transport type)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 KiB

지하철 (subway): 601

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지하철
2nd row: 지하철
3rd row: 지하철
4th row: 지하철
5th row: 지하철

Common Values

Value    Count    Frequency (%)
지하철   601      100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

노선명 (line name)
Categorical

HIGH CORRELATION

Distinct: 32
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 KiB

1호선: 19
2호선: 19
우이신설선: 19
3호선: 19
4호선: 19
Other values (27): 506

Length

Max length: 8
Median length: 3
Mean length: 3.7587354
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1호선
2nd row: 1호선
3rd row: 1호선
4th row: 1호선
5th row: 1호선

Common Values

Value                Count    Frequency (%)
1호선                19       3.2%
2호선                19       3.2%
우이신설선           19       3.2%
3호선                19       3.2%
4호선                19       3.2%
경부선               19       3.2%
경인선               19       3.2%
경원선               19       3.2%
안산선               19       3.2%
과천선               19       3.2%
Other values (22)    411      68.4%

Length

[Figure: histogram of lengths of the category]

Value                Count    Frequency (%)
1호선                38       6.1%
2호선                19       3.1%
경기철도             19       3.1%
의정부경전철         19       3.1%
용인에버라인         19       3.1%
9호선2~3단계         19       3.1%
신분당선             19       3.1%
공항철도             19       3.1%
9호선                19       3.1%
인천2호선            19       3.1%
Other values (22)    411      66.3%

년월 (year-month, YYYYMM)
Real number (ℝ)

Distinct: 19
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 201822.42
Minimum: 201711
Maximum: 201905
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.4 KiB

Quantile statistics

Minimum: 201711
5-th percentile: 201711
Q1: 201803
Median: 201808
Q3: 201901
95-th percentile: 201905
Maximum: 201905
Range: 194
Interquartile range (IQR): 98

Descriptive statistics

Standard deviation: 56.30338
Coefficient of variation (CV): 0.00027897485
Kurtosis: -0.30545952
Mean: 201822.42
Median Absolute Deviation (MAD): 5
Skewness: -0.02388867
Sum: 1.2129527 × 10^8
Variance: 3170.0706
Monotonicity: Not monotonic
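
Because 년월 stores a calendar month as a YYYYMM integer, the numeric statistics above treat the step from 201712 to 201801 as 89 units rather than one month. A minimal sketch of converting the values to calendar months before measuring a span (the helper names are illustrative and not part of the report):

```python
from datetime import date


def yyyymm_to_date(value: int) -> date:
    """Interpret an integer such as 201711 as the first day of that month."""
    year, month = divmod(value, 100)
    return date(year, month, 1)


def months_between(start: int, end: int) -> int:
    """Number of calendar months from start to end, both given as YYYYMM."""
    s, e = yyyymm_to_date(start), yyyymm_to_date(end)
    return (e.year - s.year) * 12 + (e.month - s.month)


numeric_range = 201905 - 201711                  # what the report calls Range
calendar_span = months_between(201711, 201905)   # elapsed calendar months

print(numeric_range, calendar_span)
```

The numeric range is 194, while the calendar span is 18 months, which is consistent with the 19 distinct months reported for this variable.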
[Figure: histogram with fixed size bins (bins=19)]
Common values

Value               Count    Frequency (%)
201809              32       5.3%
201811              32       5.3%
201905              32       5.3%
201904              32       5.3%
201903              32       5.3%
201902              32       5.3%
201901              32       5.3%
201806              32       5.3%
201807              32       5.3%
201808              32       5.3%
Other values (9)    281      46.8%

Minimum 10 values

Value     Count    Frequency (%)
201711    31       5.2%
201712    31       5.2%
201801    31       5.2%
201802    31       5.2%
201803    31       5.2%
201804    31       5.2%
201805    31       5.2%
201806    32       5.3%
201807    32       5.3%
201808    32       5.3%

Maximum 10 values

Value     Count    Frequency (%)
201905    32       5.3%
201904    32       5.3%
201903    32       5.3%
201902    32       5.3%
201901    32       5.3%
201812    32       5.3%
201811    32       5.3%
201810    32       5.3%
201809    32       5.3%
201808    32       5.3%
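
The per-month counts in the frequency tables above are enough to rebuild the row count, sum, and mean reported for 년월. The following sketch does that as a consistency check; the dictionary is typed in by hand from those tables.

```python
# Row counts per month, read from the frequency tables above:
# 201711 to 201805 appear 31 times each, 201806 to 201905 appear 32 times each.
counts = {}
for yyyymm in (201711, 201712, 201801, 201802, 201803, 201804, 201805):
    counts[yyyymm] = 31
for yyyymm in (201806, 201807, 201808, 201809, 201810, 201811, 201812,
               201901, 201902, 201903, 201904, 201905):
    counts[yyyymm] = 32

n_rows = sum(counts.values())
weighted_sum = sum(value * count for value, count in counts.items())
mean = weighted_sum / n_rows

print(n_rows, weighted_sum, round(mean, 2))
```

The totals agree with the 601 observations, the sum of about 1.2129527 × 10^8, and the mean of 201822.42 reported above.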

승차총승객수 (total boarding passengers)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 601
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7542216.7
Minimum: 343785
Maximum: 49356486
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.4 KiB

Quantile statistics

Minimum: 343785
5-th percentile: 643464
Q1: 1403115
Median: 3537142
Q3: 10346038
95-th percentile: 21467155
Maximum: 49356486
Range: 49012701
Interquartile range (IQR): 8942923

Descriptive statistics

Standard deviation: 9225438.1
Coefficient of variation (CV): 1.2231733
Kurtosis: 7.8861823
Mean: 7542216.7
Median Absolute Deviation (MAD): 2660743
Skewness: 2.5618347
Sum: 4.5328723 × 10^9
Variance: 8.5108708 × 10^13
Monotonicity: Not monotonic
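
Several of the statistics above are derived from others: the range and IQR from the quantiles, and the coefficient of variation and variance from the mean and standard deviation. A short sketch of those relationships, using the rounded figures printed in this report (so small rounding differences are expected):

```python
# Figures copied from the quantile and descriptive statistics above.
minimum, maximum = 343785, 49356486
q1, q3 = 1403115, 10346038
mean, std = 7542216.7, 9225438.1

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance

print(value_range, iqr, round(cv, 5), f"{variance:.4e}")
```

A coefficient of variation above 1, together with a mean well above the median, is consistent with the strong positive skewness reported for this variable.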
[Figure: histogram with fixed size bins (bins=50)]
Common values

Value                 Count    Frequency (%)
8633618               1        0.2%
4772563               1        0.2%
5793719               1        0.2%
5916680               1        0.2%
5523152               1        0.2%
5630084               1        0.2%
5445725               1        0.2%
5176534               1        0.2%
5973011               1        0.2%
5944278               1        0.2%
Other values (591)    591      98.3%

Minimum 10 values

Value     Count    Frequency (%)
343785    1        0.2%
364510    1        0.2%
374659    1        0.2%
396597    1        0.2%
397622    1        0.2%
399577    1        0.2%
409923    1        0.2%
465508    1        0.2%
469841    1        0.2%
470318    1        0.2%

Maximum 10 values

Value       Count    Frequency (%)
49356486    1        0.2%
49033111    1        0.2%
48343358    1        0.2%
48332253    1        0.2%
48288516    1        0.2%
48249398    1        0.2%
48093446    1        0.2%
48049524    1        0.2%
47703690    1        0.2%
47356791    1        0.2%

Interactions

[Figures: four interaction plots]

Correlations

[Figure: correlation heatmap]

                노선명    년월     승차총승객수
노선명          1.000     0.000    0.966
년월            0.000     1.000    0.000
승차총승객수    0.966     0.000    1.000

[Figure: correlation heatmap]

                년월     승차총승객수    노선명
년월            1.000    0.018           0.000
승차총승객수    0.018    1.000           0.818
노선명          0.000    0.818           1.000
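
The strong association between 노선명 and 승차총승객수 reflects that ridership differs far more between lines than within a line. This section does not say which association measure produced the coefficients above, so the sketch below uses a different, commonly used measure, the correlation ratio (eta), purely as an illustration. The input is the twenty rows listed in the Sample section of this report, not the full dataset, so the result is not expected to match either matrix.

```python
from math import sqrt
from statistics import fmean


def correlation_ratio(groups: dict) -> float:
    """Correlation ratio (eta) between a categorical and a numeric variable."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = fmean(all_values)
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(
        len(values) * (fmean(values) - grand_mean) ** 2 for values in groups.values()
    )
    # Total sum of squares: spread of every observation around the grand mean.
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    return sqrt(ss_between / ss_total)


# Monthly boardings for two lines, copied from the Sample section.
sample = {
    "1호선": [8633618, 8737235, 8145989, 7273309, 8692551,
             8275767, 8543247, 7972991, 8150061, 7930624],
    "우이신설선": [1222945, 1249766, 1400225, 1366119, 1301315,
                 1263643, 1102109, 1402393, 1403115, 1469681],
}

eta = correlation_ratio(sample)
print(f"eta = {eta:.3f}")
```

A value close to 1 means that knowing the line explains almost all of the variation in boardings within this small sample.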

Missing values

[Figure: a simple visualization of nullity by column]
[Figure: nullity matrix, a data-dense display which lets you quickly pick out patterns in data completion]

Sample

First rows

       대중교통구분    노선명        년월      승차총승객수
0      지하철          1호선         201711    8633618
1      지하철          1호선         201712    8737235
2      지하철          1호선         201801    8145989
3      지하철          1호선         201802    7273309
4      지하철          1호선         201803    8692551
5      지하철          1호선         201804    8275767
6      지하철          1호선         201805    8543247
7      지하철          1호선         201806    7972991
8      지하철          1호선         201807    8150061
9      지하철          1호선         201808    7930624

Last rows

       대중교통구분    노선명        년월      승차총승객수
591    지하철          우이신설선    201808    1222945
592    지하철          우이신설선    201809    1249766
593    지하철          우이신설선    201810    1400225
594    지하철          우이신설선    201811    1366119
595    지하철          우이신설선    201812    1301315
596    지하철          우이신설선    201901    1263643
597    지하철          우이신설선    201902    1102109
598    지하철          우이신설선    201903    1402393
599    지하철          우이신설선    201904    1403115
600    지하철          우이신설선    201905    1469681