Overview

Dataset statistics

Number of variables: 5
Number of observations: 276
Missing cells: 662
Missing cells (%): 48.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 12.0 KiB
Average record size in memory: 44.5 B
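
The missing-cell totals follow directly from the per-column missing counts listed under Alerts. The sketch below redoes that arithmetic. It assumes the text column has no missing values, as its variable summary states. The key "station" is a hypothetical label, since the report leaves that column unnamed.

```python
# Per-column missing counts as reported in this profile.
# "station" is a hypothetical label for the unnamed text column.
missing_per_column = {
    "station": 0,
    "고속열차": 225,
    "새마을": 128,
    "무궁화": 43,
    "통근열차": 266,
}
n_rows = 276
n_cols = len(missing_per_column)

# Dataset-level figures: missing cells over all cells (rows x columns).
missing_cells = sum(missing_per_column.values())
total_cells = n_rows * n_cols
missing_pct = 100 * missing_cells / total_cells

# Column-level figures: missing values over the number of rows.
column_pct = {name: 100 * count / n_rows for name, count in missing_per_column.items()}

print(missing_cells, f"{missing_pct:.1f}%")
for name, pct in column_pct.items():
    print(name, f"{pct:.1f}%")
```

The dataset-level percentage divides by all cells, while each column-level percentage divides by the row count only.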

Variable types

Text: 1
Numeric: 4

Dataset

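Description: Passenger transport results by station, counting alighting passengers per train type (high-speed trains, Saemaeul, and others).
Author: 한국철도공사 (Korea Railroad Corporation, KORAIL)
URL: https://www.data.go.kr/data/15068458/fileData.do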

Alerts

새마을 is highly overall correlated with 무궁화 [High correlation]
무궁화 is highly overall correlated with 새마을 and 1 other field [High correlation]
통근열차 is highly overall correlated with 무궁화 [High correlation]
고속열차 has 225 (81.5%) missing values [Missing]
새마을 has 128 (46.4%) missing values [Missing]
무궁화 has 43 (15.6%) missing values [Missing]
통근열차 has 266 (96.4%) missing values [Missing]
(unnamed text column) has unique values [Unique]

Reproduction

Analysis started: 2023-12-12 23:49:23.141421
Analysis finished: 2023-12-12 23:49:25.215781
Duration: 2.07 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
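
The reported duration is the difference between the two timestamps above. A minimal check using only the Python standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 23:49:23.141421")
finished = datetime.fromisoformat("2023-12-12 23:49:25.215781")

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```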

Variables


Text

UNIQUE 

Distinct: 276
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Length

Max length: 6
Median length: 2
Mean length: 2.2898551
Min length: 2

Characters and Unicode

Total characters: 632
Distinct characters: 194
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
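
As an illustration of these character properties, the sketch below classifies characters with Python's standard `unicodedata` module. It uses the first five station names from the Sample list plus one hypothetical mixed-script string. The `block_of` helper is a simplified stand-in written for this example and is not the profiler's own block lookup.

```python
import unicodedata
from collections import Counter

# First five station names from the Sample list, plus one hypothetical
# mixed-script string ("역T2") that is NOT taken from the dataset.
names = ["서울", "용산", "수색", "행신", "문산", "역T2"]

# General category per character: "Lo" = Other Letter,
# "Lu" = Uppercase Letter, "Nd" = Decimal Number.
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)

def block_of(ch: str) -> str:
    """Crude block lookup for illustration (hypothetical helper)."""
    if "\uac00" <= ch <= "\ud7af":  # Hangul Syllables block
        return "Hangul"
    if ord(ch) < 128:
        return "ASCII"
    return "Other"

block_counts = Counter(block_of(ch) for name in names for ch in name)

print(category_counts)
print(block_counts)
```

Hangul syllables fall in the Other Letter category, which matches the report, where all but two characters are counted as Other Letter.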

Unique

Unique: 276
Unique (%): 100.0%

Sample

1st row: 서울
2nd row: 용산
3rd row: 수색
4th row: 행신
5th row: 문산

Value               Count  Frequency (%)
서울                1      0.4%
득량                1      0.4%
진상                1      0.4%
광양                1      0.4%
벌교                1      0.4%
조성                1      0.4%
예당                1      0.4%
순천                1      0.4%
명봉                1      0.4%
구례구              1      0.4%
Other values (266)  266    96.4%

Most occurring characters

Character           Count  Frequency (%)
[lost]              31     4.9%
[lost]              22     3.5%
[lost]              19     3.0%
[lost]              18     2.8%
[lost]              14     2.2%
[lost]              13     2.1%
[lost]              13     2.1%
[lost]              12     1.9%
[lost]              10     1.6%
[lost]              10     1.6%
Other values (184)  470    74.4%

Most occurring categories

Category          Count  Frequency (%)
Other Letter      630    99.7%
Decimal Number    1      0.2%
Uppercase Letter  1      0.2%

Most frequent character per category

Other Letter

Character           Count  Frequency (%)
[lost]              31     4.9%
[lost]              22     3.5%
[lost]              19     3.0%
[lost]              18     2.9%
[lost]              14     2.2%
[lost]              13     2.1%
[lost]              13     2.1%
[lost]              12     1.9%
[lost]              10     1.6%
[lost]              10     1.6%
Other values (182)  468    74.3%

Decimal Number

Character  Count  Frequency (%)
2          1      100.0%

Uppercase Letter

Character  Count  Frequency (%)
T          1      100.0%

Most occurring scripts

Script  Count  Frequency (%)
Hangul  630    99.7%
Common  1      0.2%
Latin   1      0.2%

Most frequent character per script

Hangul

Character           Count  Frequency (%)
[lost]              31     4.9%
[lost]              22     3.5%
[lost]              19     3.0%
[lost]              18     2.9%
[lost]              14     2.2%
[lost]              13     2.1%
[lost]              13     2.1%
[lost]              12     1.9%
[lost]              10     1.6%
[lost]              10     1.6%
Other values (182)  468    74.3%

Common

Character  Count  Frequency (%)
2          1      100.0%

Latin

Character  Count  Frequency (%)
T          1      100.0%

Most occurring blocks

Block   Count  Frequency (%)
Hangul  630    99.7%
ASCII   2      0.3%

Most frequent character per block

Hangul

Character           Count  Frequency (%)
[lost]              31     4.9%
[lost]              22     3.5%
[lost]              19     3.0%
[lost]              18     2.9%
[lost]              14     2.2%
[lost]              13     2.1%
[lost]              13     2.1%
[lost]              12     1.9%
[lost]              10     1.6%
[lost]              10     1.6%
Other values (182)  468    74.3%

ASCII

Character  Count  Frequency (%)
2          1      50.0%
T          1      50.0%

고속열차 (high-speed train)
Real number (ℝ)

MISSING 

Distinct: 51
Distinct (%): 100.0%
Missing: 225
Missing (%): 81.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 1314533.2
Minimum: 13990
Maximum: 14181231
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 13990
5-th percentile: 39147.5
Q1: 117435.5
Median: 390867
Q3: 979533.5
95-th percentile: 5439788.5
Maximum: 14181231
Range: 14167241
Interquartile range (IQR): 862098

Descriptive statistics

Standard deviation: 2427984.5
Coefficient of variation (CV): 1.8470317
Kurtosis: 15.554503
Mean: 1314533.2
Median Absolute Deviation (MAD): 315990
Skewness: 3.5361466
Sum: 67041194
Variance: 5.8951089 × 10^12
Monotonicity: Not monotonic
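
Several of these figures are derived from others: the mean is the sum over the non-missing count, the range is maximum minus minimum, the IQR is Q3 minus Q1, the CV is the standard deviation over the mean, and the variance is the squared standard deviation. The sketch below recomputes them from the rounded values reported for 고속열차, so small differences in the last digits are expected.

```python
# Summary values reported for 고속열차 in this profile.
count = 276 - 225          # observations minus missing values
total = 67041194           # Sum
mean = 1314533.2
std = 2427984.5
minimum, maximum = 13990, 14181231
q1, q3 = 117435.5, 979533.5

# Quantities that follow from the values above.
derived = {
    "mean": total / count,
    "range": maximum - minimum,
    "iqr": q3 - q1,
    "cv": std / mean,
    "variance": std ** 2,
}

for name, value in derived.items():
    print(f"{name}: {value:.8g}")
```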
[Figure: histogram with fixed size bins (bins=50)]

Most frequent values

Value              Count  Frequency (%)
5137021            1      0.4%
1333029            1      0.4%
287237             1      0.4%
858979             1      0.4%
173569             1      0.4%
42715              1      0.4%
35964              1      0.4%
691601             1      0.4%
212995             1      0.4%
527737             1      0.4%
Other values (41)  41     14.9%
(Missing)          225    81.5%

Smallest 10 values

Value  Count  Frequency (%)
13990  1      0.4%
20116  1      0.4%
35964  1      0.4%
42331  1      0.4%
42715  1      0.4%
54981  1      0.4%
74877  1      0.4%
83105  1      0.4%
83141  1      0.4%
92767  1      0.4%

Largest 10 values

Value     Count  Frequency (%)
14181231  1      0.4%
6051142   1      0.4%
5742556   1      0.4%
5137021   1      0.4%
4903041   1      0.4%
4738698   1      0.4%
3277052   1      0.4%
2743629   1      0.4%
2129478   1      0.4%
2123859   1      0.4%

새마을 (Saemaeul)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 145
Distinct (%): 98.0%
Missing: 128
Missing (%): 46.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 72648.081
Minimum: 2
Maximum: 1376772
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 2
5-th percentile: 44.45
Q1: 553.75
Median: 5388.5
Q3: 44679.25
95-th percentile: 464317.2
Maximum: 1376772
Range: 1376770
Interquartile range (IQR): 44125.5

Descriptive statistics

Standard deviation: 181622.89
Coefficient of variation (CV): 2.500037
Kurtosis: 21.906715
Mean: 72648.081
Median Absolute Deviation (MAD): 5329
Skewness: 4.2244665
Sum: 10751916
Variance: 3.2986875 × 10^10
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Most frequent values

Value               Count  Frequency (%)
2                   2      0.7%
335                 2      0.7%
63                  2      0.7%
75283               1      0.4%
54527               1      0.4%
215199              1      0.4%
664630              1      0.4%
2218                1      0.4%
27180               1      0.4%
443135              1      0.4%
Other values (135)  135    48.9%
(Missing)           128    46.4%

Smallest 10 values

Value  Count  Frequency (%)
2      2      0.7%
5      1      0.4%
11     1      0.4%
24     1      0.4%
30     1      0.4%
35     1      0.4%
42     1      0.4%
49     1      0.4%
56     1      0.4%
63     2      0.7%

Largest 10 values

Value    Count  Frequency (%)
1376772  1      0.4%
910171   1      0.4%
674079   1      0.4%
664630   1      0.4%
545770   1      0.4%
536832   1      0.4%
508801   1      0.4%
475723   1      0.4%
443135   1      0.4%
367397   1      0.4%

무궁화 (Mugunghwa)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 233
Distinct (%): 100.0%
Missing: 43
Missing (%): 15.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 242927.06
Minimum: 8
Maximum: 4416588
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 8
5-th percentile: 480.6
Q1: 6301
Median: 38331
Q3: 146009
95-th percentile: 1505355
Maximum: 4416588
Range: 4416580
Interquartile range (IQR): 139708

Descriptive statistics

Standard deviation: 582591.1
Coefficient of variation (CV): 2.3982141
Kurtosis: 18.946297
Mean: 242927.06
Median Absolute Deviation (MAD): 36758
Skewness: 4.0434009
Sum: 56602006
Variance: 3.394124 × 10^11
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Most frequent values

Value               Count  Frequency (%)
4378                1      0.4%
32951               1      0.4%
1305                1      0.4%
696                 1      0.4%
771043              1      0.4%
1919822             1      0.4%
16924               1      0.4%
65691               1      0.4%
876399              1      0.4%
3377                1      0.4%
Other values (223)  223    80.8%
(Missing)           43     15.6%

Smallest 10 values

Value  Count  Frequency (%)
8      1      0.4%
188    1      0.4%
220    1      0.4%
279    1      0.4%
325    1      0.4%
335    1      0.4%
400    1      0.4%
410    1      0.4%
430    1      0.4%
454    1      0.4%

Largest 10 values

Value    Count  Frequency (%)
4416588  1      0.4%
3442822  1      0.4%
2854646  1      0.4%
2407004  1      0.4%
2363879  1      0.4%
2354296  1      0.4%
2342738  1      0.4%
2173175  1      0.4%
1919822  1      0.4%
1668314  1      0.4%

통근열차 (commuter train)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 10
Distinct (%): 100.0%
Missing: 266
Missing (%): 96.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 38969.3
Minimum: 130
Maximum: 186309
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 130
5-th percentile: 148
Q1: 1231
Median: 2285
Q3: 59225
95-th percentile: 153388.35
Maximum: 186309
Range: 186179
Interquartile range (IQR): 57994

Descriptive statistics

Standard deviation: 65259.971
Coefficient of variation (CV): 1.6746508
Kurtosis: 1.883231
Mean: 38969.3
Median Absolute Deviation (MAD): 2135
Skewness: 1.6505504
Sum: 389693
Variance: 4.2588638 × 10^9
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=10)]

Most frequent values

Value      Count  Frequency (%)
186309     1      0.4%
2163       1      0.4%
130        1      0.4%
170        1      0.4%
5333       1      0.4%
77189      1      0.4%
1042       1      0.4%
1798       1      0.4%
2407       1      0.4%
113152     1      0.4%
(Missing)  266    96.4%

Smallest 10 values

Value   Count  Frequency (%)
130     1      0.4%
170     1      0.4%
1042    1      0.4%
1798    1      0.4%
2163    1      0.4%
2407    1      0.4%
5333    1      0.4%
77189   1      0.4%
113152  1      0.4%
186309  1      0.4%

Largest 10 values

Value   Count  Frequency (%)
186309  1      0.4%
113152  1      0.4%
77189   1      0.4%
5333    1      0.4%
2407    1      0.4%
2163    1      0.4%
1798    1      0.4%
1042    1      0.4%
170     1      0.4%
130     1      0.4%
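
Because 통근열차 has only ten non-missing values and all of them are listed above, its summary statistics can be recomputed by hand. The sketch below uses Python's standard `statistics` module. It assumes linear interpolation between order statistics for the quartiles (`method="inclusive"`) and a sample standard deviation with n - 1 in the denominator. These choices reproduce the reported figures for this data, but the report itself does not state which estimators the profiler uses.

```python
import statistics

# The ten non-missing 통근열차 values listed in this section.
values = [130, 170, 1042, 1798, 2163, 2407, 5333, 77189, 113152, 186309]

mean = statistics.mean(values)
median = statistics.median(values)

# method="inclusive" interpolates linearly between order statistics.
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")

# Sample standard deviation (n - 1 in the denominator).
std = statistics.stdev(values)

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(f"sum={sum(values)} mean={mean} median={median}")
print(f"Q1={q1} Q3={q3} IQR={q3 - q1}")
print(f"std={std:.3f} CV={std / mean:.7f} MAD={mad}")
```

The median (2285) sits far below the mean (38969.3) because three large stations dominate the sum, which is consistent with the positive skewness reported above.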

Interactions

[Figures: 16 interaction plots; image content not preserved]

Correlations

[Figure: correlation heatmap]

          고속열차  새마을  무궁화  통근열차
고속열차  1.000     0.502   0.683   NaN
새마을    0.502     1.000   0.893   NaN
무궁화    0.683     0.893   1.000   NaN
통근열차  NaN       NaN     NaN     1.000

[Figure: correlation heatmap]

          고속열차  새마을  무궁화  통근열차
고속열차  1.000     0.415   0.448   NaN
새마을    0.415     1.000   0.635   0.486
무궁화    0.448     0.635   1.000   1.000
통근열차  NaN       0.486   1.000   1.000
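
The report does not say which coefficient each matrix shows, nor why some entries are NaN. One common cause is that two sparsely populated columns share too few rows where both values are present. The sketch below illustrates that with a Pearson correlation over pairwise-complete rows. The function name and the toy columns are hypothetical, and this is not necessarily the estimator the profiler applies.

```python
import math

def pearson_pairwise(xs, ys):
    """Pearson correlation over rows where both values are present.

    Returns NaN when fewer than two complete pairs exist or when either
    side is constant, since the coefficient is undefined in those cases.
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    if len(pairs) < 2:
        return math.nan
    mean_x = sum(x for x, _ in pairs) / len(pairs)
    mean_y = sum(y for _, y in pairs) / len(pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        return math.nan
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy columns (None marks a missing value); not dataset values.
a = [10, 20, 30, None, None]
b = [1, 2, 4, 8, None]
c = [None, None, None, 5, 7]

print(pearson_pairwise(a, b))  # three complete pairs, so a finite number
print(pearson_pairwise(a, c))  # no complete pairs, so nan
```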

Missing values

[Figure: nullity count] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure: nullity heatmap] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
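
Nullity correlation can be illustrated by correlating 0/1 presence indicators. The sketch below uses only the first ten rows shown in the Sample section, so its result describes those rows and not the full dataset. The function name is hypothetical, and using Pearson correlation on the indicators is an assumption about how the heatmap is computed.

```python
import math

# Presence indicators (1 = value present, 0 = missing) read from the first
# ten rows shown in the Sample section of this report.
present = {
    "고속열차": [1, 1, 0, 1, 0, 0, 0, 0, 1, 1],
    "새마을": [1, 1, 0, 0, 1, 1, 1, 1, 0, 0],
    "무궁화": [1, 1, 1, 0, 1, 0, 1, 0, 0, 0],
    "통근열차": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
}

def nullity_correlation(x, y):
    """Pearson correlation between two 0/1 presence vectors (NaN if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return math.nan
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

r = nullity_correlation(present["고속열차"], present["새마을"])
print(f"{r:.3f}")
```

A negative value means that, within these rows, stations with a 고속열차 figure tend to lack a 새마을 figure and vice versa. A column that is entirely missing in the sample, such as 통근열차 here, yields NaN because its indicator never varies.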

Sample

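First rows

Index  (unnamed)  고속열차  새마을  무궁화  통근열차
0      서울       14181231  545770  2363879  <NA>
1      용산       5137021   508801  1534713  <NA>
2      수색       <NA>      <NA>    1315     <NA>
3      행신       721378    <NA>    <NA>     <NA>
4      문산       <NA>      1598814 [새마을 and 무궁화 digits fused in source]  <NA>
5      운천       <NA>      5       <NA>     <NA>
6      임진강     <NA>      1492864 [새마을 and 무궁화 digits fused in source]  <NA>
7      도라산     <NA>      20781   <NA>     <NA>
8      인천공항   74877     <NA>    <NA>     <NA>
9      검암       20116     <NA>    <NA>     <NA>

Last rows

Index  (unnamed)  고속열차  새마을  무궁화  통근열차
266    완사       <NA>      <NA>    3886     <NA>
267    북천       <NA>      206815590 [새마을 and 무궁화 digits fused in source]  <NA>
268    양보       <NA>      <NA>    <NA>     <NA>
269    횡천       <NA>      <NA>    7168     <NA>
270    하동       <NA>      222531535 [새마을 and 무궁화 digits fused in source]  <NA>
271    울산       2129478   <NA>    <NA>     <NA>
272    진례       <NA>      <NA>    3473     <NA>
273    창원중앙   77516745363249890 [고속열차, 새마을 and 무궁화 digits fused in source]  <NA>
274    가야       <NA>      <NA>    <NA>     <NA>
275    신해운대   <NA>      36695190735 [새마을 and 무궁화 digits fused in source]  <NA>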