Overview

Dataset statistics

Number of variables: 5
Number of observations: 276
Missing cells: 661
Missing cells (%): 47.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 12.0 KiB
Average record size in memory: 44.5 B
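
The missing-cell figures can be cross-checked against the per-variable missing counts listed under Alerts. The sketch below is only an arithmetic check on numbers copied from this report; the variable names are illustrative and it does not call ydata-profiling.

```python
# Figures copied from this report; names such as missing_per_column are illustrative.
n_rows, n_cols = 276, 5
missing_per_column = {"고속열차": 225, "새마을": 128, "무궁화": 42, "통근열차": 266}

total_cells = n_rows * n_cols                      # every cell, including the text column
missing_cells = sum(missing_per_column.values())   # the text column has no missing values
missing_pct = 100 * missing_cells / total_cells

# Per-variable missing share, relative to the number of observations.
per_column_pct = {
    name: round(100 * count / n_rows, 1) for name, count in missing_per_column.items()
}

print(missing_cells, round(missing_pct, 1))
print(per_column_pct)
```

The four per-variable counts add up to the reported 661 missing cells, and dividing by 276 × 5 cells gives the reported 47.9%.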

Variable types

Text: 1
Numeric: 4

Dataset

Description: Boarding passenger transport figures by station (high-speed trains, Saemaeul, etc.).
Author: 한국철도공사 (Korea Railroad Corporation)
URL: https://www.data.go.kr/data/15068456/fileData.do

Alerts

새마을 is highly correlated overall with 무궁화 and 1 other field (High correlation)
무궁화 is highly correlated overall with 새마을 and 1 other field (High correlation)
통근열차 is highly correlated overall with 새마을 and 1 other field (High correlation)
고속열차 has 225 (81.5%) missing values (Missing)
새마을 has 128 (46.4%) missing values (Missing)
무궁화 has 42 (15.2%) missing values (Missing)
통근열차 has 266 (96.4%) missing values (Missing)
The unnamed text variable has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 17:13:29.686550
Analysis finished: 2023-12-12 17:13:32.674007
Duration: 2.99 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables


Text

UNIQUE 

Distinct: 276
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Length

Max length: 6
Median length: 2
Mean length: 2.2898551
Min length: 2

Characters and Unicode

Total characters: 632
Distinct characters: 194
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
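
As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories for the five station names shown under Sample, using Python's standard unicodedata module. It is not the profiler's own code, and the standard library exposes the general category but not the script or block.

```python
import unicodedata
from collections import Counter

# The five station names listed under "Sample" for the text variable.
sample_names = ["서울", "용산", "수색", "행신", "문산"]

# Normalise to NFC so each Hangul syllable is a single code point.
normalised = [unicodedata.normalize("NFC", name) for name in sample_names]

# General category of every character: "Lo" is Other Letter,
# "Nd" is Decimal Number, "Lu" is Uppercase Letter.
category_counts = Counter(
    unicodedata.category(ch) for name in normalised for ch in name
)
lengths = [len(name) for name in normalised]

print(category_counts)
print(lengths)
print(unicodedata.category("T"), unicodedata.category("2"))
```

All ten characters in these five names fall in the Other Letter category, in line with the report's finding that 630 of 632 characters are Other Letter.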

Unique

Unique: 276
Unique (%): 100.0%

Sample

1st row: 서울
2nd row: 용산
3rd row: 수색
4th row: 행신
5th row: 문산
Value                 Count   Frequency (%)
서울                  1       0.4%
득량                  1       0.4%
진상                  1       0.4%
광양                  1       0.4%
벌교                  1       0.4%
조성                  1       0.4%
예당                  1       0.4%
순천                  1       0.4%
명봉                  1       0.4%
구례구                1       0.4%
Other values (266)    266     96.4%

Most occurring characters

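Value                 Count   Frequency (%)
(not preserved)       31      4.9%
(not preserved)       22      3.5%
(not preserved)       19      3.0%
(not preserved)       18      2.8%
(not preserved)       14      2.2%
(not preserved)       13      2.1%
(not preserved)       13      2.1%
(not preserved)       12      1.9%
(not preserved)       10      1.6%
(not preserved)       10      1.6%
Other values (184)    470     74.4%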

Most occurring categories

Value                 Count   Frequency (%)
Other Letter          630     99.7%
Decimal Number        1       0.2%
Uppercase Letter      1       0.2%

Most frequent character per category

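Other Letter
Value                 Count   Frequency (%)
(not preserved)       31      4.9%
(not preserved)       22      3.5%
(not preserved)       19      3.0%
(not preserved)       18      2.9%
(not preserved)       14      2.2%
(not preserved)       13      2.1%
(not preserved)       13      2.1%
(not preserved)       12      1.9%
(not preserved)       10      1.6%
(not preserved)       10      1.6%
Other values (182)    468     74.3%

Decimal Number
Value                 Count   Frequency (%)
2                     1       100.0%

Uppercase Letter
Value                 Count   Frequency (%)
T                     1       100.0%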

Most occurring scripts

Value                 Count   Frequency (%)
Hangul                630     99.7%
Common                1       0.2%
Latin                 1       0.2%

Most frequent character per script

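Hangul
Value                 Count   Frequency (%)
(not preserved)       31      4.9%
(not preserved)       22      3.5%
(not preserved)       19      3.0%
(not preserved)       18      2.9%
(not preserved)       14      2.2%
(not preserved)       13      2.1%
(not preserved)       13      2.1%
(not preserved)       12      1.9%
(not preserved)       10      1.6%
(not preserved)       10      1.6%
Other values (182)    468     74.3%

Common
Value                 Count   Frequency (%)
2                     1       100.0%

Latin
Value                 Count   Frequency (%)
T                     1       100.0%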

Most occurring blocks

Value                 Count   Frequency (%)
Hangul                630     99.7%
ASCII                 2       0.3%

Most frequent character per block

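Hangul
Value                 Count   Frequency (%)
(not preserved)       31      4.9%
(not preserved)       22      3.5%
(not preserved)       19      3.0%
(not preserved)       18      2.9%
(not preserved)       14      2.2%
(not preserved)       13      2.1%
(not preserved)       13      2.1%
(not preserved)       12      1.9%
(not preserved)       10      1.6%
(not preserved)       10      1.6%
Other values (182)    468     74.3%

ASCII
Value                 Count   Frequency (%)
2                     1       50.0%
T                     1       50.0%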

고속열차 (high-speed train)
Real number (ℝ)

MISSING 

Distinct: 51
Distinct (%): 100.0%
Missing: 225
Missing (%): 81.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 1314533.2
Minimum: 14696
Maximum: 13821606
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 14696
5-th percentile: 36520
Q1: 124899
Median: 408956
Q3: 1035383.5
95-th percentile: 5375036
Maximum: 13821606
Range: 13806910
Interquartile range (IQR): 910484.5

Descriptive statistics

Standard deviation: 2385772.9
Coefficient of variation (CV): 1.8149202
Kurtosis: 14.858069
Mean: 1314533.2
Median Absolute Deviation (MAD): 323527
Skewness: 3.4548526
Sum: 67041194
Variance: 5.6919121 × 10^12
Monotonicity: Not monotonic
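
Several of the figures above are derived from others, which gives a quick consistency check. The sketch below recomputes them from the rounded values printed in this report, so small rounding differences are expected; the variable names are illustrative.

```python
import math

# Values copied from the 고속열차 statistics in this report (already rounded).
mean, std = 1314533.2, 2385772.9
minimum, maximum = 14696, 13821606
q1, q3 = 124899, 1035383.5
total = 67041194

value_range = maximum - minimum      # Range
iqr = q3 - q1                        # Interquartile range
cv = std / mean                      # Coefficient of variation
variance = std ** 2                  # Variance is the squared standard deviation
n_present = round(total / mean)      # Sum / mean recovers the non-missing count

print(value_range, iqr, n_present)
print(round(cv, 4), f"{variance:.4e}")
```

The recovered count of non-missing values equals 276 observations minus 225 missing, which is also the reported number of distinct values.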
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
4923025               1       0.4%
1372368               1       0.4%
292509                1       0.4%
883019                1       0.4%
175482                1       0.4%
37485                 1       0.4%
35555                 1       0.4%
692023                1       0.4%
220116                1       0.4%
506540                1       0.4%
Other values (41)     41      14.9%
(Missing)             225     81.5%

Value                 Count   Frequency (%)
14696                 1       0.4%
21738                 1       0.4%
35555                 1       0.4%
37485                 1       0.4%
48959                 1       0.4%
52921                 1       0.4%
60153                 1       0.4%
85429                 1       0.4%
86535                 1       0.4%
96325                 1       0.4%

Value                 Count   Frequency (%)
13821606              1       0.4%
6054561               1       0.4%
5827047               1       0.4%
4923025               1       0.4%
4888131               1       0.4%
4696528               1       0.4%
3326931               1       0.4%
2839965               1       0.4%
2136326               1       0.4%
2106885               1       0.4%

새마을 (Saemaeul)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 145
Distinct (%): 98.0%
Missing: 128
Missing (%): 46.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 72648.081
Minimum: 2
Maximum: 1319446
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 2
5-th percentile: 48.35
Q1: 619
Median: 5387.5
Q3: 46115.25
95-th percentile: 462855.6
Maximum: 1319446
Range: 1319444
Interquartile range (IQR): 45496.25

Descriptive statistics

Standard deviation: 178721.88
Coefficient of variation (CV): 2.4601046
Kurtosis: 19.917249
Mean: 72648.081
Median Absolute Deviation (MAD): 5330
Skewness: 4.0576903
Sum: 10751916
Variance: 3.194151 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
335                   3       1.1%
2                     2       0.7%
463375                1       0.4%
655                   1       0.4%
2626                  1       0.4%
193365                1       0.4%
627179                1       0.4%
24830                 1       0.4%
461891                1       0.4%
1366                  1       0.4%
Other values (135)    135     48.9%
(Missing)             128     46.4%

Value                 Count   Frequency (%)
2                     2       0.7%
11                    1       0.4%
15                    1       0.4%
24                    1       0.4%
29                    1       0.4%
30                    1       0.4%
48                    1       0.4%
49                    1       0.4%
50                    1       0.4%
56                    1       0.4%

Value                 Count   Frequency (%)
1319446               1       0.4%
881101                1       0.4%
688029                1       0.4%
627179                1       0.4%
603409                1       0.4%
530128                1       0.4%
511593                1       0.4%
463375                1       0.4%
461891                1       0.4%
380759                1       0.4%
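
The percentages for 새마을 use two different denominators, which is easy to miss. The figures in this report are consistent with Distinct (%) being taken over non-missing values and Frequency (%) over all observations. The sketch below only reproduces that arithmetic from numbers in this report; it is not the profiler's code.

```python
# Figures copied from the 새마을 section of this report.
n_rows = 276
n_missing = 128
n_distinct = 145

n_present = n_rows - n_missing

# Distinct (%) matches the report only when divided by the non-missing count.
distinct_pct = round(100 * n_distinct / n_present, 1)

# Frequency (%) in the value tables matches when divided by all observations.
def frequency_pct(count, total=n_rows):
    """Share of all observations, as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)

print(n_present, distinct_pct)
print(frequency_pct(3), frequency_pct(2), frequency_pct(1), frequency_pct(135))
```

With 148 non-missing values, 145 distinct values give the reported 98.0%, while a value occurring three times gets 1.1% because it is measured against all 276 rows.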

무궁화 (Mugunghwa)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 233
Distinct (%): 99.6%
Missing: 42
Missing (%): 15.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 241888.91
Minimum: 188
Maximum: 4400814
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 188
5-th percentile: 532.65
Q1: 6525.75
Median: 36020.5
Q3: 142526.75
95-th percentile: 1501243
Maximum: 4400814
Range: 4400626
Interquartile range (IQR): 136001

Descriptive statistics

Standard deviation: 582598.75
Coefficient of variation (CV): 2.4085385
Kurtosis: 19.199292
Mean: 241888.91
Median Absolute Deviation (MAD): 34491
Skewness: 4.0694235
Sum: 56602006
Variance: 3.3942131 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
480                   2       0.7%
3998                  1       0.4%
1555                  1       0.4%
795220                1       0.4%
1916700               1       0.4%
17292                 1       0.4%
51521                 1       0.4%
889205                1       0.4%
3831                  1       0.4%
2339224               1       0.4%
Other values (223)    223     80.8%
(Missing)             42      15.2%

Value                 Count   Frequency (%)
188                   1       0.4%
321                   1       0.4%
339                   1       0.4%
361                   1       0.4%
400                   1       0.4%
405                   1       0.4%
410                   1       0.4%
430                   1       0.4%
454                   1       0.4%
464                   1       0.4%

Value                 Count   Frequency (%)
4400814               1       0.4%
3544084               1       0.4%
2873827               1       0.4%
2475664               1       0.4%
2341165               1       0.4%
2339224               1       0.4%
2281132               1       0.4%
2134671               1       0.4%
1916700               1       0.4%
1610274               1       0.4%

통근열차 (commuter train)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 10
Distinct (%): 100.0%
Missing: 266
Missing (%): 96.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 38969.3
Minimum: 323
Maximum: 173913
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 323
5-th percentile: 397.7
Q1: 1027.75
Median: 15183.5
Q3: 60966.25
95-th percentile: 132040.5
Maximum: 173913
Range: 173590
Interquartile range (IQR): 59938.5

Descriptive statistics

Standard deviation: 55919.551
Coefficient of variation (CV): 1.4349642
Kurtosis: 3.3666811
Mean: 38969.3
Median Absolute Deviation (MAD): 14777.5
Skewness: 1.8328648
Sum: 389693
Variance: 3.1269962 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value                 Count   Frequency (%)
173913                1       0.4%
8210                  1       0.4%
489                   1       0.4%
323                   1       0.4%
30046                 1       0.4%
80863                 1       0.4%
846                   1       0.4%
1573                  1       0.4%
22157                 1       0.4%
71273                 1       0.4%
(Missing)             266     96.4%

Value                 Count   Frequency (%)
323                   1       0.4%
489                   1       0.4%
846                   1       0.4%
1573                  1       0.4%
8210                  1       0.4%
22157                 1       0.4%
30046                 1       0.4%
71273                 1       0.4%
80863                 1       0.4%
173913                1       0.4%

Value                 Count   Frequency (%)
173913                1       0.4%
80863                 1       0.4%
71273                 1       0.4%
30046                 1       0.4%
22157                 1       0.4%
8210                  1       0.4%
1573                  1       0.4%
846                   1       0.4%
489                   1       0.4%
323                   1       0.4%
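
Because all ten non-missing 통근열차 values appear in the tables above, its summary statistics can be recomputed by hand. The sketch below uses only the Python standard library and assumes linear-interpolation percentiles and a sample (n - 1) standard deviation; those assumptions reproduce the reported figures, but this is not the profiler's own implementation, and the percentile helper is a hypothetical name.

```python
import statistics

# The ten non-missing 통근열차 values listed in this report.
values = [323, 489, 846, 1573, 8210, 22157, 30046, 71273, 80863, 173913]

def percentile(sorted_values, p):
    """Linear-interpolation percentile (p in [0, 1]) on pre-sorted data."""
    pos = p * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

data = sorted(values)
total = sum(data)
mean = statistics.mean(data)
median = statistics.median(data)
q1, q3 = percentile(data, 0.25), percentile(data, 0.75)
p5, p95 = percentile(data, 0.05), percentile(data, 0.95)
std = statistics.stdev(data)                               # sample standard deviation
mad = statistics.median(abs(x - median) for x in data)     # median absolute deviation
cv = std / mean

print(total, mean, median, q1, q3, q3 - q1, mad)
print(round(p5, 1), round(p95, 1), round(std, 3), round(cv, 7))
```

With only ten observations, the quartiles and the 5th and 95th percentiles are interpolated between neighbouring values, so they are sensitive to any single station's count.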

Interactions

[Figures: 16 pairwise interaction plots of the numeric variables (고속열차, 새마을, 무궁화, 통근열차)]

Correlations

            고속열차   새마을   무궁화   통근열차
고속열차    1.000      0.693    0.612    NaN
새마을      0.693      1.000    0.891    NaN
무궁화      0.612      0.891    1.000    NaN
통근열차    NaN        NaN      NaN      1.000

            고속열차   새마을   무궁화   통근열차
고속열차    1.000      0.407    0.461    NaN
새마을      0.407      1.000    0.653    0.771
무궁화      0.461      0.653    1.000    1.000
통근열차    NaN        0.771    1.000    1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
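
To make the idea of nullity correlation concrete, the sketch below builds presence indicators (1 = value present, 0 = <NA>) for the first ten rows shown under Sample and computes a plain Pearson correlation between them. It covers only those ten rows, not the full dataset, and the pearson helper is a hypothetical name rather than the routine behind the report's heatmap.

```python
import math

# Presence indicators for sample rows 0-9 (1 = present, 0 = <NA>).
present = {
    "고속열차": [1, 1, 0, 1, 0, 0, 0, 0, 1, 1],
    "새마을":   [1, 1, 0, 0, 1, 1, 1, 1, 0, 0],
    "무궁화":   [1, 1, 1, 0, 1, 0, 1, 0, 0, 0],
    "통근열차": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
}

def pearson(x, y):
    """Pearson correlation of two equal-length sequences; None if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return None  # a column that is always present or always missing
    return cov / math.sqrt(var_x * var_y)

names = list(present)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        print(a, b, pearson(present[a], present[b]))
```

In these ten rows 통근열차 is always missing, so its indicator is constant and no correlation is defined for it. A negative value, as between 고속열차 and 새마을 here, means that one variable tends to be missing where the other is present.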

Sample

고속열차  새마을  무궁화  통근열차
0  서울  13821606  511593  2475664  <NA>
1  용산  4923025  530128  1560979  <NA>
2  수색  <NA>  <NA>  1303  <NA>
3  행신  770568  <NA>  <NA>  <NA>
4  문산  <NA>  1757814  <NA>
5  운천  <NA>  15  <NA>  <NA>
6  임진강  <NA>  1137480  <NA>
7  도라산  <NA>  20431  <NA>  <NA>
8  인천공항  60153  <NA>  <NA>  <NA>
9  검암  21738  <NA>  <NA>  <NA>

고속열차  새마을  무궁화  통근열차
266  완사  <NA>  <NA>  3596  <NA>
267  북천  <NA>  182815697  <NA>
268  양보  <NA>  <NA>  <NA>  <NA>
269  횡천  <NA>  <NA>  6741  <NA>
270  하동  <NA>  267129887  <NA>
271  울산  2136326  <NA>  <NA>  <NA>
272  진례  <NA>  <NA>  3879  <NA>
273  창원중앙  76327760669252290  <NA>
274  가야  <NA>  <NA>  <NA>  <NA>
275  신해운대  <NA>  31652171642  <NA>