Overview

Dataset statistics

Number of variables: 5
Number of observations: 88
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.7 KiB
Average record size in memory: 43.5 B

Variable types

Categorical: 3
Numeric: 2

Dataset

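Description: KTX boarding and alighting data giving monthly passenger counts for Pyeongchang Station (평창역), split by up/down direction, from January 2019 through July 2022.
Author: 한국철도공사 (Korea Railroad Corporation, KORAIL)
URL: https://www.data.go.kr/data/15106040/fileData.do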

Alerts

단위 (unit) has constant value "" [Constant]
승차인원수 (number of boarding passengers) is highly overall correlated with 상행하행구분 (up/down direction) [High correlation]
하차인원수 (number of alighting passengers) is highly overall correlated with 상행하행구분 [High correlation]
상행하행구분 is highly overall correlated with 승차인원수 and 1 other field [High correlation]
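
The Constant alert can be illustrated with a small check: a column qualifies when every row holds the same value. This is a minimal sketch, not the profiler's own code. The helper name `constant_columns` is hypothetical, the four rows are read from the start of the Sample section, and the empty string for 단위 is a placeholder that mirrors how the alert prints the value.

```python
def constant_columns(rows):
    """Return the names of columns that hold exactly one distinct value."""
    columns = list(rows[0].keys())
    return [c for c in columns if len({row[c] for row in rows}) == 1]

# First four rows of the Sample section; "" for 단위 is a placeholder value.
rows = [
    {"운행년월": "2019-01", "상행하행구분": "하행", "단위": "", "승차인원수": 1164, "하차인원수": 8025},
    {"운행년월": "2019-01", "상행하행구분": "상행", "단위": "", "승차인원수": 8161, "하차인원수": 1216},
    {"운행년월": "2019-02", "상행하행구분": "하행", "단위": "", "승차인원수": 1247, "하차인원수": 7766},
    {"운행년월": "2019-02", "상행하행구분": "상행", "단위": "", "승차인원수": 7749, "하차인원수": 1234},
]

constant = constant_columns(rows)  # only 단위 has a single distinct value
```

A column flagged this way carries no information for modelling and can usually be dropped before analysis.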

Reproduction

Analysis started: 2023-12-12 08:56:40.452640
Analysis finished: 2023-12-12 08:56:41.188508
Duration: 0.74 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
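
A report like this one could be regenerated along the following lines. This is a minimal sketch, assuming the CSV has already been downloaded from the URL in the Dataset section. The function name `build_profile`, the file name, the report title, and the `cp949` encoding in the usage comment are assumptions for illustration; none of them is recorded in the report itself.

```python
def build_profile(csv_path, out_path="report.html", encoding="utf-8"):
    """Profile a CSV file and write the HTML report to out_path."""
    # Imported lazily so this module loads even where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="KTX Pyeongchang Station monthly ridership")
    report.to_file(out_path)
    return out_path

# Hypothetical usage; the file name is a placeholder, not the real download name.
# build_profile("ktx_pyeongchang.csv", encoding="cp949")
```

If the file fails to decode as UTF-8, trying a Korean legacy encoding is a reasonable next step, but the correct encoding for this file is not stated in the report.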

Variables

운행년월 (service year-month)
Categorical

Distinct: 43
Distinct (%): 48.9%
Missing: 0
Missing (%): 0.0%
Memory size: 836.0 B
2021-08: 4
2019-12: 2
2019-03: 2
2019-04: 2
2019-05: 2
Other values (38): 76
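
The distinct count can be cross-checked against the date range in the dataset description. The sketch below assumes each month normally contributes one row per direction, which is inferred from the counts of 2 above and not stated by the report. The helper `months_inclusive` is a hypothetical name.

```python
def months_inclusive(start, end):
    """Count calendar months from start to end, both given as 'YYYY-MM'."""
    start_year, start_month = map(int, start.split("-"))
    end_year, end_month = map(int, end.split("-"))
    return (end_year - start_year) * 12 + (end_month - start_month) + 1

n_rows = 88                                        # Number of observations
n_months = months_inclusive("2019-01", "2022-07")  # range from the description
expected_rows = n_months * 2                       # assumed: two directions per month
surplus = n_rows - expected_rows
distinct_pct = round(100 * n_months / n_rows, 1)

print(n_months, expected_rows, surplus, distinct_pct)  # 43 86 2 48.9
```

The two surplus rows are consistent with 2021-08 appearing four times instead of two. Because the report lists zero duplicate rows, those extra records must differ in at least one column.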

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2019-01
2nd row: 2019-01
3rd row: 2019-02
4th row: 2019-02
5th row: 2019-03

Common Values

Value | Count | Frequency (%)
2021-08 | 4 | 4.5%
2019-12 | 2 | 2.3%
2019-03 | 2 | 2.3%
2019-04 | 2 | 2.3%
2019-05 | 2 | 2.3%
2019-06 | 2 | 2.3%
2019-07 | 2 | 2.3%
2019-08 | 2 | 2.3%
2019-09 | 2 | 2.3%
2019-10 | 2 | 2.3%
Other values (33) | 66 | 75.0%

Length

[Figure: Histogram of lengths of the category]
Value | Count | Frequency (%)
2021-08 | 4 | 4.5%
2020-11 | 2 | 2.3%
2021-09 | 2 | 2.3%
2021-01 | 2 | 2.3%
2021-02 | 2 | 2.3%
2021-03 | 2 | 2.3%
2021-04 | 2 | 2.3%
2021-05 | 2 | 2.3%
2021-06 | 2 | 2.3%
2021-07 | 2 | 2.3%
Other values (33) | 66 | 75.0%

상행하행구분 (up/down direction)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 836.0 B
하행 (down): 44
상행 (up): 44

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 하행
2nd row: 상행
3rd row: 하행
4th row: 상행
5th row: 하행

Common Values

Value | Count | Frequency (%)
하행 | 44 | 50.0%
상행 | 44 | 50.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
하행 | 44 | 50.0%
상행 | 44 | 50.0%

단위 (unit)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 836.0 B
(blank): 88

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: (blank)
2nd row: (blank)
3rd row: (blank)
4th row: (blank)
5th row: (blank)

Common Values

Value | Count | Frequency (%)
(blank) | 88 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
(blank) | 88 | 100.0%

승차인원수 (number of boarding passengers)
Real number (ℝ)

HIGH CORRELATION

Distinct: 87
Distinct (%): 98.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3746.1591
Minimum: 68
Maximum: 10014
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 924.0 B

Quantile statistics

Minimum: 68
5th percentile: 742.7
Q1: 1087.5
Median: 1566.5
Q3: 6450.75
95th percentile: 8264.05
Maximum: 10014
Range: 9946
Interquartile range (IQR): 5363.25

Descriptive statistics

Standard deviation: 3003.0956
Coefficient of variation (CV): 0.80164658
Kurtosis: -1.3607848
Mean: 3746.1591
Median Absolute Deviation (MAD): 1354.5
Skewness: 0.45150299
Sum: 329662
Variance: 9018583.4
Monotonicity: Not monotonic
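
Several of these figures are related by simple formulas, which allows a quick consistency check. The sketch below uses only numbers printed in this report for 승차인원수 (the sum, the observation count, and the standard deviation). It assumes the conventional definitions mean = sum / n, CV = standard deviation / mean, and variance = standard deviation squared.

```python
n = 88               # Number of observations
total = 329662       # Sum
std = 3003.0956      # Standard deviation, as printed (rounded to 4 decimals)

mean = total / n     # compare with the reported mean, 3746.1591
cv = std / mean      # compare with the reported CV, 0.80164658
variance = std ** 2  # compare with the reported variance, 9018583.4

print(round(mean, 4), round(cv, 5), round(variance, 1))
```

The recomputed variance differs from the reported one by a fraction of a unit, which is what one would expect from squaring a standard deviation that was rounded to four decimals.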
[Figure: Histogram with fixed size bins (bins=50)]
Common values

Value | Count | Frequency (%)
912 | 2 | 2.3%
1164 | 1 | 1.1%
5658 | 1 | 1.1%
7206 | 1 | 1.1%
960 | 1 | 1.1%
83 | 1 | 1.1%
68 | 1 | 1.1%
6477 | 1 | 1.1%
1047 | 1 | 1.1%
6151 | 1 | 1.1%
Other values (77) | 77 | 87.5%

Minimum 10 values

Value | Count | Frequency (%)
68 | 1 | 1.1%
83 | 1 | 1.1%
654 | 1 | 1.1%
712 | 1 | 1.1%
742 | 1 | 1.1%
744 | 1 | 1.1%
749 | 1 | 1.1%
844 | 1 | 1.1%
895 | 1 | 1.1%
908 | 1 | 1.1%

Maximum 10 values

Value | Count | Frequency (%)
10014 | 1 | 1.1%
9682 | 1 | 1.1%
9504 | 1 | 1.1%
8688 | 1 | 1.1%
8270 | 1 | 1.1%
8253 | 1 | 1.1%
8163 | 1 | 1.1%
8161 | 1 | 1.1%
8112 | 1 | 1.1%
8030 | 1 | 1.1%

하차인원수 (number of alighting passengers)
Real number (ℝ)

HIGH CORRELATION

Distinct: 86
Distinct (%): 97.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3834.2614
Minimum: 68
Maximum: 10190
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 924.0 B

Quantile statistics

Minimum: 68
5th percentile: 750.15
Q1: 1089.75
Median: 1589
Q3: 6653.25
95th percentile: 8619.85
Maximum: 10190
Range: 10122
Interquartile range (IQR): 5563.5
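
The range and IQR follow directly from the other quantiles, and the IQR can also be used for a rough outlier screen. The sketch below recomputes both from the printed values for 하차인원수 and then applies Tukey's 1.5 × IQR fences. The fences are an added illustration and an assumption on our part; the report itself does not use them.

```python
minimum, q1, q3, maximum = 68, 1089.75, 6653.25, 10190

value_range = maximum - minimum  # reported Range: 10122
iqr = q3 - q1                    # reported IQR: 5563.5

# Tukey's fences (not part of the report): points beyond them are flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
has_outliers = minimum < lower_fence or maximum > upper_fence

print(value_range, iqr, lower_fence, upper_fence, has_outliers)
# 10122 5563.5 -7255.5 14998.5 False
```

Under this rule no monthly count is an outlier, even though the minimum of 68 is far below the 5th percentile. The wide IQR is what keeps the fences so far apart.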

Descriptive statistics

Standard deviation: 3117.9989
Coefficient of variation (CV): 0.81319414
Kurtosis: -1.3539425
Mean: 3834.2614
Median Absolute Deviation (MAD): 1345.5
Skewness: 0.45765571
Sum: 337415
Variance: 9721916.9
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Common values

Value | Count | Frequency (%)
1194 | 2 | 2.3%
1098 | 2 | 2.3%
8025 | 1 | 1.1%
6365 | 1 | 1.1%
6929 | 1 | 1.1%
68 | 1 | 1.1%
83 | 1 | 1.1%
1017 | 1 | 1.1%
7215 | 1 | 1.1%
1013 | 1 | 1.1%
Other values (76) | 76 | 86.4%

Minimum 10 values

Value | Count | Frequency (%)
68 | 1 | 1.1%
83 | 1 | 1.1%
599 | 1 | 1.1%
726 | 1 | 1.1%
747 | 1 | 1.1%
756 | 1 | 1.1%
773 | 1 | 1.1%
830 | 1 | 1.1%
850 | 1 | 1.1%
854 | 1 | 1.1%

Maximum 10 values

Value | Count | Frequency (%)
10190 | 1 | 1.1%
9998 | 1 | 1.1%
9996 | 1 | 1.1%
9226 | 1 | 1.1%
8637 | 1 | 1.1%
8588 | 1 | 1.1%
8440 | 1 | 1.1%
8328 | 1 | 1.1%
8272 | 1 | 1.1%
8144 | 1 | 1.1%

Interactions

[Figures: four interaction plots, images not included]

Correlations

[Figure: correlation heatmap]
Variable | 운행년월 | 상행하행구분 | 승차인원수 | 하차인원수
운행년월 | 1.000 | 0.000 | 0.000 | 0.000
상행하행구분 | 0.000 | 1.000 | 0.999 | 0.999
승차인원수 | 0.000 | 0.999 | 1.000 | 0.793
하차인원수 | 0.000 | 0.999 | 0.793 | 1.000

[Figure: correlation heatmap]
Variable | 운행년월 | 상행하행구분
운행년월 | 1.000 | 0.000
상행하행구분 | 0.000 | 1.000

[Figure: correlation heatmap]
Variable | 승차인원수 | 하차인원수 | 운행년월 | 상행하행구분
승차인원수 | 1.000 | -0.428 | 0.000 | 0.929
하차인원수 | -0.428 | 1.000 | 0.000 | 0.929
운행년월 | 0.000 | 0.000 | 1.000 | 0.000
상행하행구분 | 0.929 | 0.929 | 0.000 | 1.000
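
The sign of the relationship between boarding and alighting counts can be explored by hand. The sketch below computes a plain Pearson coefficient over the 20 rows listed in the Sample section, with each fused digit run split into a boarding and an alighting count (for example, 11648025 read as 1164 and 8025). This rests on two assumptions: the report does not name the coefficient behind each matrix, and 20 of 88 rows will not reproduce the reported values.

```python
import math

# Boarding and alighting counts from the first ten and last ten sample rows.
boarding = [1164, 8161, 1247, 7749, 1154, 5529, 1282, 5795, 1475, 8012,
            915, 4560, 1086, 5672, 1346, 8163, 1366, 8112, 1414, 8688]
alighting = [8025, 1216, 7766, 1234, 5313, 1098, 5932, 1205, 7787, 1481,
             4392, 854, 5920, 1129, 8272, 1343, 8588, 1453, 9226, 1338]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)

r = pearson(boarding, alighting)
print(round(r, 3))
```

In these rows the direction with many boardings has few alightings and vice versa, so the coefficient comes out negative. That matches the sign of the -0.428 entry in the third matrix.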

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

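First rows

Row | 운행년월 | 상행하행구분 | 단위 | 승차인원수 | 하차인원수
0 | 2019-01 | 하행 | (blank) | 1164 | 8025
1 | 2019-01 | 상행 | (blank) | 8161 | 1216
2 | 2019-02 | 하행 | (blank) | 1247 | 7766
3 | 2019-02 | 상행 | (blank) | 7749 | 1234
4 | 2019-03 | 하행 | (blank) | 1154 | 5313
5 | 2019-03 | 상행 | (blank) | 5529 | 1098
6 | 2019-04 | 하행 | (blank) | 1282 | 5932
7 | 2019-04 | 상행 | (blank) | 5795 | 1205
8 | 2019-05 | 하행 | (blank) | 1475 | 7787
9 | 2019-05 | 상행 | (blank) | 8012 | 1481

Last rows

Row | 운행년월 | 상행하행구분 | 단위 | 승차인원수 | 하차인원수
78 | 2022-03 | 하행 | (blank) | 915 | 4392
79 | 2022-03 | 상행 | (blank) | 4560 | 854
80 | 2022-04 | 하행 | (blank) | 1086 | 5920
81 | 2022-04 | 상행 | (blank) | 5672 | 1129
82 | 2022-05 | 하행 | (blank) | 1346 | 8272
83 | 2022-05 | 상행 | (blank) | 8163 | 1343
84 | 2022-06 | 하행 | (blank) | 1366 | 8588
85 | 2022-06 | 상행 | (blank) | 8112 | 1453
86 | 2022-07 | 하행 | (blank) | 1414 | 9226
87 | 2022-07 | 상행 | (blank) | 8688 | 1338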
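
Because each month is split into one row per direction, a common first step is to combine the two rows into station-level monthly totals. The sketch below does this for the first six sample rows, with each fused digit run split into a boarding and an alighting count (for example, 11648025 read as 1164 and 8025). The tuple layout and variable names are illustrative choices; they are not part of the dataset.

```python
from collections import defaultdict

# (month, direction, boarding, alighting) for sample rows 0 through 5.
rows = [
    ("2019-01", "하행", 1164, 8025),
    ("2019-01", "상행", 8161, 1216),
    ("2019-02", "하행", 1247, 7766),
    ("2019-02", "상행", 7749, 1234),
    ("2019-03", "하행", 1154, 5313),
    ("2019-03", "상행", 5529, 1098),
]

# Sum boarding and alighting counts over both directions for each month.
totals = defaultdict(lambda: [0, 0])
for month, _direction, boarded, alighted in rows:
    totals[month][0] += boarded
    totals[month][1] += alighted

for month in sorted(totals):
    boarded, alighted = totals[month]
    print(month, boarded, alighted)
# 2019-01 9325 9241
# 2019-02 8996 9000
# 2019-03 6683 6411
```

Monthly boarding and alighting totals come out close to each other, as one would expect for a station where most travellers make a round trip.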