Overview

Dataset statistics

Number of variables: 3
Number of observations: 293
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.6 KiB
Average record size in memory: 26.5 B

Variable types

Text: 1
Numeric: 2

Dataset

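Description: Station-level boarding and alighting figures for the metropolitan (wide-area) railways of Korea Railroad Corporation (KORAIL). The data lists 2021 boardings and alightings by station for the Gyeongbu, Gyeongin, Gyeongwon, Jungang, Ansan, Gwacheon, Bundang, Ilsan, Janghang, Gyeonggang, Seohae, and Donghae lines.
Author: Korea Railroad Corporation (한국철도공사)
URL: https://www.data.go.kr/data/15100373/fileData.do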

Alerts

승차 (boardings) is highly correlated overall with 하차 (alightings) [High correlation]
하차 is highly correlated overall with 승차 [High correlation]
역명 (station name) has unique values [Unique]
승차 has unique values [Unique]
하차 has unique values [Unique]

Reproduction

Analysis started: 2023-12-12 07:43:00.428666
Analysis finished: 2023-12-12 07:43:01.048783
Duration: 0.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

역명 (station name)
Text

UNIQUE 

Distinct: 293
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 9
Median length: 2
Mean length: 2.6518771
Min length: 2

Characters and Unicode

Total characters: 777
Distinct characters: 211
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
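As a rough illustration of how such character properties can be read, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, the input is only the five sample station names shown in this report, and the profiling tool's own implementation may differ.

```python
import unicodedata
from collections import Counter


def category_counts(names):
    """Count Unicode general categories over all characters in `names`."""
    counts = Counter()
    for name in names:
        for ch in name:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample rows listed for 역명 in this report (not the full column).
sample_names = ["서울", "남영", "용산", "노량진", "대방"]
counts = category_counts(sample_names)
print(counts)

# Hangul syllables are category "Lo" (Other Letter); parentheses fall into
# "Ps" (Open Punctuation) and "Pe" (Close Punctuation).
print(unicodedata.category("("), unicodedata.category(")"))
```

Run over all 293 station names, the same tally would be expected to give the three categories reported here, although that has not been checked against the source data.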

Unique

Unique: 293
Unique (%): 100.0%

Sample

1st row: 서울
2nd row: 남영
3rd row: 용산
4th row: 노량진
5th row: 대방
Value | Count | Frequency (%)
서울 | 1 | 0.3%
서현 | 1 | 0.3%
일산 | 1 | 0.3%
풍산 | 1 | 0.3%
백마 | 1 | 0.3%
곡산 | 1 | 0.3%
능곡 | 1 | 0.3%
행신 | 1 | 0.3%
강매 | 1 | 0.3%
화전 | 1 | 0.3%
Other values (283) | 283 | 96.6%

Most occurring characters

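Value | Count | Frequency (%)
[unreadable] | 25 | 3.2%
[unreadable] | 25 | 3.2%
[unreadable] | 22 | 2.8%
[unreadable] | 20 | 2.6%
[unreadable] | 17 | 2.2%
[unreadable] | 17 | 2.2%
[unreadable] | 15 | 1.9%
[unreadable] | 13 | 1.7%
) | 13 | 1.7%
( | 13 | 1.7%
Other values (201) | 597 | 76.8%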

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 751 | 96.7%
Close Punctuation | 13 | 1.7%
Open Punctuation | 13 | 1.7%

Most frequent character per category

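Other Letter
Value | Count | Frequency (%)
[unreadable] | 25 | 3.3%
[unreadable] | 25 | 3.3%
[unreadable] | 22 | 2.9%
[unreadable] | 20 | 2.7%
[unreadable] | 17 | 2.3%
[unreadable] | 17 | 2.3%
[unreadable] | 15 | 2.0%
[unreadable] | 13 | 1.7%
[unreadable] | 12 | 1.6%
[unreadable] | 11 | 1.5%
Other values (199) | 574 | 76.4%

Close Punctuation
Value | Count | Frequency (%)
) | 13 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 13 | 100.0%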

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 751 | 96.7%
Common | 26 | 3.3%

Most frequent character per script

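Hangul
Value | Count | Frequency (%)
[unreadable] | 25 | 3.3%
[unreadable] | 25 | 3.3%
[unreadable] | 22 | 2.9%
[unreadable] | 20 | 2.7%
[unreadable] | 17 | 2.3%
[unreadable] | 17 | 2.3%
[unreadable] | 15 | 2.0%
[unreadable] | 13 | 1.7%
[unreadable] | 12 | 1.6%
[unreadable] | 11 | 1.5%
Other values (199) | 574 | 76.4%

Common
Value | Count | Frequency (%)
) | 13 | 50.0%
( | 13 | 50.0%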

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 751 | 96.7%
ASCII | 26 | 3.3%

Most frequent character per block

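Hangul
Value | Count | Frequency (%)
[unreadable] | 25 | 3.3%
[unreadable] | 25 | 3.3%
[unreadable] | 22 | 2.9%
[unreadable] | 20 | 2.7%
[unreadable] | 17 | 2.3%
[unreadable] | 17 | 2.3%
[unreadable] | 15 | 2.0%
[unreadable] | 13 | 1.7%
[unreadable] | 12 | 1.6%
[unreadable] | 11 | 1.5%
Other values (199) | 574 | 76.4%

ASCII
Value | Count | Frequency (%)
) | 13 | 50.0%
( | 13 | 50.0%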

승차 (boardings)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 293
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2104072.5
Minimum: 551
Maximum: 12003970
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 551
5th percentile: 82714.6
Q1: 597992
Median: 1355789
Q3: 2774218
95th percentile: 6725669.4
Maximum: 12003970
Range: 12003419
Interquartile range (IQR): 2176226

Descriptive statistics

Standard deviation: 2118954.7
Coefficient of variation (CV): 1.007073
Kurtosis: 3.2847167
Mean: 2104072.5
Median Absolute Deviation (MAD): 900720
Skewness: 1.7381408
Sum: 6.1649325 × 10^8
Variance: 4.4899689 × 10^12
Monotonicity: Not monotonic
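Several of the figures above are derived from one another, so they can be cross-checked by hand. The following sketch recomputes the range, IQR, coefficient of variation, sum, and variance from the reported 승차 values. It assumes only the standard definitions (CV = standard deviation / mean, variance = standard deviation squared, sum = mean × count); small differences are expected because the inputs are rounded.

```python
import math

# Values copied from the 승차 statistics in this report.
n = 293
mean = 2104072.5
std = 2118954.7
minimum, maximum = 551, 12003970
q1, q3 = 597992, 2774218

value_range = maximum - minimum   # reported: 12003419
iqr = q3 - q1                     # reported: 2176226
cv = std / mean                   # reported: 1.007073
total = mean * n                  # reported: 6.1649325 × 10^8
variance = std ** 2               # reported: 4.4899689 × 10^12

print(value_range, iqr, round(cv, 6))
print(f"{total:.7e}", f"{variance:.7e}")
```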
Histogram with fixed size bins (bins=50)
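To make the fixed-size binning concrete, here is a minimal sketch that splits the reported 승차 range into 50 equal-width bins and counts a few values from this report. The function `fixed_width_bins` is a hypothetical helper, and placing the maximum in the last bin is an assumed convention (the one NumPy uses); the report does not say how its histogram was computed.

```python
def fixed_width_bins(values, bins, lo, hi):
    """Count values into `bins` equal-width bins spanning [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp so the maximum falls into the last bin instead of overflowing.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


# Minimum and maximum of 승차 as reported; bins=50 as in the caption.
lo, hi = 551, 12003970
# A handful of 승차 values taken from the tables in this report.
values = [551, 714, 22280, 1119845, 2774218, 5557772, 10536843, 12003970]
counts = fixed_width_bins(values, bins=50, lo=lo, hi=hi)

print((hi - lo) / 50)          # width of one bin
print(counts[0], counts[-1])   # occupancy of the first and last bins
```

With a right-skewed variable like this one, most stations land in the first few bins, which is consistent with the skewness reported above.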
Common values
Value | Count | Frequency (%)
5557772 | 1 | 0.3%
1119845 | 1 | 0.3%
2774218 | 1 | 0.3%
1693948 | 1 | 0.3%
2035063 | 1 | 0.3%
176518 | 1 | 0.3%
1151451 | 1 | 0.3%
2323851 | 1 | 0.3%
937091 | 1 | 0.3%
511305 | 1 | 0.3%
Other values (283) | 283 | 96.6%

Minimum 10 values
Value | Count | Frequency (%)
551 | 1 | 0.3%
714 | 1 | 0.3%
1542 | 1 | 0.3%
1706 | 1 | 0.3%
2086 | 1 | 0.3%
3100 | 1 | 0.3%
4650 | 1 | 0.3%
10159 | 1 | 0.3%
21632 | 1 | 0.3%
22280 | 1 | 0.3%

Maximum 10 values
Value | Count | Frequency (%)
12003970 | 1 | 0.3%
10536843 | 1 | 0.3%
10264638 | 1 | 0.3%
9090902 | 1 | 0.3%
8279241 | 1 | 0.3%
8066130 | 1 | 0.3%
7883128 | 1 | 0.3%
7881304 | 1 | 0.3%
7723765 | 1 | 0.3%
7713308 | 1 | 0.3%

하차 (alightings)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 293
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2094576.9
Minimum: 580
Maximum: 12336747
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 580
5th percentile: 68909
Q1: 594242
Median: 1349411
Q3: 2762167
95th percentile: 7077922
Maximum: 12336747
Range: 12336167
Interquartile range (IQR): 2167925
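Quantiles such as these are commonly computed by linear interpolation between neighbouring order statistics, which is the default in pandas and NumPy. Whether this report used that method is an assumption. The sketch below implements the rule in a hypothetical helper `percentile` and applies it to the ten smallest 하차 values listed in this report, not to the full column.

```python
def percentile(values, p):
    """Percentile (0 <= p <= 100) by linear interpolation between ranks."""
    data = sorted(values)
    pos = (len(data) - 1) * p / 100      # fractional position in sorted data
    lower = int(pos)
    upper = min(lower + 1, len(data) - 1)
    frac = pos - lower
    return data[lower] + (data[upper] - data[lower]) * frac


# The ten smallest 하차 values listed in this report.
smallest = [580, 697, 1467, 1747, 2105, 2846, 4636, 8488, 20651, 21688]

q1 = percentile(smallest, 25)
median = percentile(smallest, 50)
q3 = percentile(smallest, 75)
print(q1, median, q3, q3 - q1)
```

Because the position rarely lands exactly on a data point, interpolated quantiles can be fractional even when every observation is an integer.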

Descriptive statistics

Standard deviation: 2158156.3
Coefficient of variation (CV): 1.0303543
Kurtosis: 3.4694586
Mean: 2094576.9
Median Absolute Deviation (MAD): 900311
Skewness: 1.7896499
Sum: 6.1371102 × 10^8
Variance: 4.6576387 × 10^12
Monotonicity: Not monotonic
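The median absolute deviation is the median of each value's absolute distance from the median, which makes it far less sensitive to the busiest stations than the standard deviation. A minimal sketch, assuming the unscaled definition (no normal-consistency factor) and using only the ten largest 하차 values listed in this report:

```python
from statistics import median


def mad(values):
    """Unscaled median absolute deviation around the median."""
    center = median(values)
    return median(abs(v - center) for v in values)


# The ten largest 하차 values listed in this report.
largest = [12336747, 10696201, 10370938, 9206597, 8244027,
           8204445, 8089689, 8076943, 8043723, 7953577]

print(median(largest), mad(largest))
```

The result for this subset is not comparable with the MAD reported above, which covers all 293 stations.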
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
3948265 | 1 | 0.3%
1384183 | 1 | 0.3%
2762167 | 1 | 0.3%
1605143 | 1 | 0.3%
1967333 | 1 | 0.3%
181954 | 1 | 0.3%
1069751 | 1 | 0.3%
2341711 | 1 | 0.3%
828906 | 1 | 0.3%
478335 | 1 | 0.3%
Other values (283) | 283 | 96.6%

Minimum 10 values
Value | Count | Frequency (%)
580 | 1 | 0.3%
697 | 1 | 0.3%
1467 | 1 | 0.3%
1747 | 1 | 0.3%
2105 | 1 | 0.3%
2846 | 1 | 0.3%
4636 | 1 | 0.3%
8488 | 1 | 0.3%
20651 | 1 | 0.3%
21688 | 1 | 0.3%

Maximum 10 values
Value | Count | Frequency (%)
12336747 | 1 | 0.3%
10696201 | 1 | 0.3%
10370938 | 1 | 0.3%
9206597 | 1 | 0.3%
8244027 | 1 | 0.3%
8204445 | 1 | 0.3%
8089689 | 1 | 0.3%
8076943 | 1 | 0.3%
8043723 | 1 | 0.3%
7953577 | 1 | 0.3%

Interactions

[Figures: four interaction plots covering the pairings of 승차 and 하차]

Correlations

[Figure: correlation heatmap]
Variable | 승차 | 하차
승차 | 1.000 | 0.998
하차 | 0.998 | 1.000
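For a sense of how such a coefficient arises, the sketch below computes Pearson's correlation for the first ten stations in this report's sample rows. Using Pearson's coefficient is an assumption: the report does not state which measure produced 0.998, and a ten-row subset will not reproduce the full-dataset value.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# 승차 and 하차 for rows 0-9 of the sample (서울 through 가산디지털단지).
boardings = [5557772, 2719772, 10536843, 5577099, 3776271,
             2239779, 12003970, 5239540, 5674599, 7713308]
alightings = [3948265, 2852917, 10696201, 5446130, 3891798,
              2016651, 12336747, 5196912, 5805350, 8204445]

r = pearson(boardings, alightings)
print(round(r, 3))
```

A value this close to 1 is unsurprising for commuter rail, where most passengers who board at a station also alight there on the return trip.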

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completeness.

Sample

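First rows
Index | 역명 | 승차 | 하차
0 | 서울 | 5557772 | 3948265
1 | 남영 | 2719772 | 2852917
2 | 용산 | 10536843 | 10696201
3 | 노량진 | 5577099 | 5446130
4 | 대방 | 3776271 | 3891798
5 | 신길 | 2239779 | 2016651
6 | 영등포 | 12003970 | 12336747
7 | 신도림 | 5239540 | 5196912
8 | 구로 | 5674599 | 5805350
9 | 가산디지털단지 | 7713308 | 8204445

Last rows
Index | 역명 | 승차 | 하차
283 | 기장 | 1018202 | 988892
284 | 일광 | 1041924 | 995477
285 | 좌천 | 3100 | 2846
286 | 월내 | 2086 | 2105
287 | 서생 | 1542 | 1467
288 | 남창 | 4650 | 4636
289 | 망양 | 714 | 697
290 | 덕하 | 1706 | 1747
291 | 개운포 | 551 | 580
292 | 태화강 | 22280 | 22444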