Overview

Dataset statistics

Number of variables: 5
Number of observations: 69
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.8 KiB
Average record size in memory: 41.9 B
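The two memory figures are linked by a division: the average record size is the total size over the number of observations. The sketch below walks through that arithmetic. The split of the total into a fixed 128-byte index plus 8 bytes per cell is an assumption, chosen only because it agrees with the 680.0 B shown for every variable; the report does not state how the total is composed.

```python
# Figures taken from the report.
N_ROWS = 69
N_COLS = 5

# Assumptions (not stated in the report): every cell is an 8-byte object
# pointer, and the row index costs a fixed 128 bytes.
BYTES_PER_CELL = 8
INDEX_BYTES = 128

column_bytes = N_ROWS * BYTES_PER_CELL              # data of one column
per_column_with_index = column_bytes + INDEX_BYTES  # what a per-variable figure would show
total_bytes = N_COLS * column_bytes + INDEX_BYTES   # index counted once for the whole table

total_kib = total_bytes / 1024
avg_record_bytes = total_bytes / N_ROWS

print(f"per column incl. index: {per_column_with_index} B")
print(f"total: {total_kib:.1f} KiB")
print(f"average record: {avg_record_bytes:.1f} B")
```

Under these assumptions the three printed values agree with the 680.0 B, 2.8 KiB and 41.9 B in the report.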

Variable types

Categorical: 4
Text: 1

Dataset

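Description: First KTX opening dates by station, for stations operated by Korea Railroad. For each station on each line, the data shows the types of trains in service and the date KTX service opened.
Author: 한국철도공사 (Korea Railroad Corporation)
URL: https://www.data.go.kr/data/15126723/fileData.do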

Alerts

선명 is highly overall correlated with 간선열차 and 1 other field (High correlation)
간선열차 is highly overall correlated with 선명 and 1 other field (High correlation)
개통일 is highly overall correlated with 선명 and 1 other field (High correlation)
역명 has unique values (Unique)

Reproduction

Analysis started: 2024-03-14 08:48:50.487122
Analysis finished: 2024-03-14 08:48:51.324152
Duration: 0.84 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

선명 (line name)
Categorical

HIGH CORRELATION

Distinct: 16
Distinct (%): 23.2%
Missing: 0
Missing (%): 0.0%
Memory size: 680.0 B

Length

Max length: 11
Median length: 3
Mean length: 5.0289855
Min length: 3

Unique

Unique: 5
Unique (%): 7.2%

Sample

1st row: 경부고속
2nd row: 경부고속
3rd row: 경부고속
4th row: 경부고속
5th row: 경부고속

Common Values

Value                  Count   Frequency (%)
경부고속               9       13.0%
경부선                 9       13.0%
호남선                 7       10.1%
전라선                 7       10.1%
중앙선(청량리-도담)    7       10.1%
경강선(강릉선)         6       8.7%
경전선                 5       7.2%
호남고속               4       5.8%
중앙선(도담-경주)      4       5.8%
중부내륙선             3       4.3%
Other values (6)       8       11.6%

Length

Histogram of lengths of the category

역명 (station name)
Text

UNIQUE

Distinct: 69
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 680.0 B

Length

Max length: 7
Median length: 2
Mean length: 2.4782609
Min length: 2

Characters and Unicode

Total characters: 171
Distinct characters: 87
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
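
As an illustration of those character properties, the sketch below classifies the characters of the five sample station names with Python's standard `unicodedata` module. The general category codes `Lo`, `Ps` and `Pe` correspond to the report's "Other Letter", "Open Punctuation" and "Close Punctuation". The standard library has no script lookup, so the `rough_script` helper is a hypothetical approximation based on the character name, not the method the profiler uses.

```python
import unicodedata
from collections import Counter

# The five sample station names shown in this report.
names = ["광명", "천안아산", "오송", "대전", "김천(구미)"]

def char_category(ch: str) -> str:
    """Return the two-letter Unicode general category, e.g. 'Lo' or 'Ps'."""
    return unicodedata.category(ch)

def rough_script(ch: str) -> str:
    """Approximate the script from the character name (hypothetical helper)."""
    return "Hangul" if unicodedata.name(ch, "").startswith("HANGUL") else "Common"

chars = [ch for name in names for ch in name]
category_counts = Counter(char_category(ch) for ch in chars)
script_counts = Counter(rough_script(ch) for ch in chars)

print(len(chars))
print(dict(category_counts))
print(dict(script_counts))
```

On these five names the only non-letter characters are the two parentheses in 김천(구미), which is the same pattern the full column shows at a larger scale.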

Unique

Unique: 69
Unique (%): 100.0%

Sample

1st row: 광명
2nd row: 천안아산
3rd row: 오송
4th row: 대전
5th row: 김천(구미)

Value                Count   Frequency (%)
광명                 1       1.4%
창원중앙             1       1.4%
덕소                 1       1.4%
상봉                 1       1.4%
청량리               1       1.4%
포항                 1       1.4%
여수엑스포           1       1.4%
여천                 1       1.4%
구례구               1       1.4%
순천                 1       1.4%
Other values (59)    59      85.5%

Most occurring characters

Value                Count   Frequency (%)
                     10      5.8%
                     9       5.3%
                     7       4.1%
                     6       3.5%
                     6       3.5%
                     5       2.9%
                     5       2.9%
                     4       2.3%
                     4       2.3%
                     4       2.3%
Other values (77)    111     64.9%

Most occurring categories

Value                Count   Frequency (%)
Other Letter         167     97.7%
Close Punctuation    2       1.2%
Open Punctuation     2       1.2%

Most frequent character per category

Other Letter

Value                Count   Frequency (%)
                     10      6.0%
                     9       5.4%
                     7       4.2%
                     6       3.6%
                     6       3.6%
                     5       3.0%
                     5       3.0%
                     4       2.4%
                     4       2.4%
                     4       2.4%
Other values (75)    107     64.1%

Close Punctuation

Value                Count   Frequency (%)
)                    2       100.0%

Open Punctuation

Value                Count   Frequency (%)
(                    2       100.0%

Most occurring scripts

Value                Count   Frequency (%)
Hangul               167     97.7%
Common               4       2.3%

Most frequent character per script

Hangul

Value                Count   Frequency (%)
                     10      6.0%
                     9       5.4%
                     7       4.2%
                     6       3.6%
                     6       3.6%
                     5       3.0%
                     5       3.0%
                     4       2.4%
                     4       2.4%
                     4       2.4%
Other values (75)    107     64.1%

Common

Value                Count   Frequency (%)
)                    2       50.0%
(                    2       50.0%

Most occurring blocks

Value                Count   Frequency (%)
Hangul               167     97.7%
ASCII                4       2.3%

Most frequent character per block

Hangul

Value                Count   Frequency (%)
                     10      6.0%
                     9       5.4%
                     7       4.2%
                     6       3.6%
                     6       3.6%
                     5       3.0%
                     5       3.0%
                     4       2.4%
                     4       2.4%
                     4       2.4%
Other values (75)    107     64.1%

ASCII

Value                Count   Frequency (%)
)                    2       50.0%
(                    2       50.0%

간선열차 (mainline trains)
Categorical

HIGH CORRELATION

Distinct: 8
Distinct (%): 11.6%
Missing: 0
Missing (%): 0.0%
Memory size: 680.0 B

Length

Max length: 22
Median length: 21
Mean length: 11.695652
Min length: 3

Unique

Unique: 2
Unique (%): 2.9%

Sample

1st row: KTX
2nd row: KTX
3rd row: KTX, 무궁화
4th row: KTX, ITX새마을, 무궁화
5th row: KTX

Common Values

Value                                Count   Frequency (%)
KTX, ITX새마을, 무궁화               28      40.6%
KTX                                  18      26.1%
KTX, 무궁화, 누리로                  10      14.5%
KTX, 무궁화                          4       5.8%
KTX, ITX새마을, 새마을, 무궁화       4       5.8%
KTX, 누리로                          3       4.3%
KTX, ITX새마을, 무궁화, 통근열차     1       1.4%
ktx, 무궁화, 누리로                  1       1.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value        Count   Frequency (%)
ktx          69      40.8%
무궁화       48      28.4%
itx새마을    33      19.5%
누리로       14      8.3%
새마을       4       2.4%
통근열차     1       0.6%
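
The token counts in this table can be reproduced from the Common Values table for 간선열차 by splitting each value on commas and lowercasing the pieces. The sketch below does that with the standard library. The tokenisation rule is an assumption inferred from the lowercase tokens in the table; the report does not document it.

```python
from collections import Counter

# Value -> count, copied from the Common Values table for 간선열차.
value_counts = {
    "KTX, ITX새마을, 무궁화": 28,
    "KTX": 18,
    "KTX, 무궁화, 누리로": 10,
    "KTX, 무궁화": 4,
    "KTX, ITX새마을, 새마을, 무궁화": 4,
    "KTX, 누리로": 3,
    "KTX, ITX새마을, 무궁화, 통근열차": 1,
    "ktx, 무궁화, 누리로": 1,
}

def tokenize(value: str) -> list[str]:
    """Split on commas, trim whitespace, and lowercase each token (assumed rule)."""
    return [part.strip().lower() for part in value.split(",") if part.strip()]

token_counts = Counter()
for value, count in value_counts.items():
    for token in tokenize(value):
        token_counts[token] += count

total = sum(token_counts.values())
for token, count in token_counts.most_common():
    print(f"{token}\t{count}\t{100 * count / total:.1f}%")
```

Lowercasing folds the single "ktx, 무궁화, 누리로" row into the same tokens as its uppercase counterpart. In the raw column the two spellings remain separate categories, which is why Distinct is 8 rather than 7.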

광역열차 (metropolitan trains)
Categorical

Distinct: 3
Distinct (%): 4.3%
Missing: 0
Missing (%): 0.0%
Memory size: 680.0 B

Length

Max length: 11
Median length: 4
Mean length: 4.3043478
Min length: 4
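
These length statistics follow from the three values and counts listed in the Common Values table for this variable. A minimal sketch, assuming that length means the number of Unicode code points in each value:

```python
from statistics import mean, median

# Value -> count, copied from the Common Values table for 광역열차.
value_counts = {
    "해당없음": 57,
    "광역전철": 9,
    "ITX청춘, 광역전철": 3,
}

# One length per observation; len() counts code points, not bytes.
lengths = [len(value) for value, count in value_counts.items() for _ in range(count)]

print("max:", max(lengths))
print("median:", median(lengths))
print("mean:", round(mean(lengths), 7))
print("min:", min(lengths))
```

The mean works out to 297 characters over 69 rows, which rounds to the 4.3043478 shown in the report.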

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 광역전철
2nd row: 해당없음
3rd row: 해당없음
4th row: 해당없음
5th row: 해당없음

Common Values

Value                Count   Frequency (%)
해당없음             57      82.6%
광역전철             9       13.0%
ITX청춘, 광역전철    3       4.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value       Count   Frequency (%)
해당없음    57      79.2%
광역전철    12      16.7%
itx청춘     3       4.2%

개통일 (opening date)
Categorical

HIGH CORRELATION

Distinct: 15
Distinct (%): 21.7%
Missing: 0
Missing (%): 0.0%
Memory size: 680.0 B

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 6
Unique (%): 8.7%

Sample

1st row: 2004-04-01
2nd row: 2004-04-01
3rd row: 2010-11-01
4th row: 2004-04-01
5th row: 2010-11-01

Common Values

Value               Count   Frequency (%)
2004-04-01          19      27.5%
2017-12-22          9       13.0%
2011-10-05          7       10.1%
2021-01-05          7       10.1%
2010-11-01          6       8.7%
2021-12-31          5       7.2%
2010-12-15          5       7.2%
2020-03-02          3       4.3%
2015-04-02          2       2.9%
2022-03-31          1       1.4%
Other values (5)    5       7.2%
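
The Distinct and Unique figures for 개통일 can be checked against this table: Distinct counts the different values, while Unique counts the values that occur exactly once. The sketch below rebuilds both from the listed counts. The five dates grouped under "Other values (5)" are not named in the report, so the `other-N` keys are placeholders; since five values cover five rows, each of them must occur once.

```python
from collections import Counter

# Date -> count, copied from the Common Values table for 개통일.
listed = {
    "2004-04-01": 19, "2017-12-22": 9, "2011-10-05": 7, "2021-01-05": 7,
    "2010-11-01": 6, "2021-12-31": 5, "2010-12-15": 5, "2020-03-02": 3,
    "2015-04-02": 2, "2022-03-31": 1,
}
# Placeholder keys: the actual dates behind "Other values (5)" are not listed.
others = {f"other-{i}": 1 for i in range(1, 6)}

counts = Counter({**listed, **others})
n_rows = sum(counts.values())

distinct = len(counts)                              # number of different values
unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(n_rows, distinct, unique)
print(f"distinct: {100 * distinct / n_rows:.1f}%  unique: {100 * unique / n_rows:.1f}%")
```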

Length

Histogram of lengths of the category

Correlations

            선명     역명     간선열차   광역열차   개통일
선명        1.000    1.000    0.890      0.575      0.932
역명        1.000    1.000    1.000      1.000      1.000
간선열차    0.890    1.000    1.000      0.409      0.853
광역열차    0.575    1.000    0.409      1.000      0.213
개통일      0.932    1.000    0.853      0.213      1.000
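
Every entry in the 역명 row and column is 1.000. One statistic that behaves this way for a column whose values are all unique is the uncorrected Cramér's V. The report does not name the coefficient behind this matrix, so that choice is an assumption. The sketch below implements it with the standard library on hypothetical toy data, not on rows from the dataset.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs: list[str], ys: list[str]) -> float:
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    x_counts = Counter(xs)
    y_counts = Counter(ys)
    pair_counts = Counter(zip(xs, ys))
    k = min(len(x_counts), len(y_counts)) - 1
    if n == 0 or k <= 0:
        return 0.0  # undefined for a constant variable; 0.0 is returned by convention
    chi2 = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n
            observed = pair_counts.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))

# Toy data (hypothetical).
station = ["a", "b", "c", "d", "e", "f", "g", "h"]      # all values unique
line = ["x", "x", "x", "x", "y", "y", "y", "y"]
alternating = ["p", "q", "p", "q", "p", "q", "p", "q"]  # independent of `line`

print(round(cramers_v(station, line), 3))
print(round(cramers_v(line, alternating), 3))
```

A unique-valued column determines every other column exactly, so the statistic reaches its maximum of 1 in the first call, while the second pair is independent and gives 0.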

            선명     광역열차   개통일   간선열차
선명        1.000    0.338      0.655    0.501
광역열차    0.338    1.000      0.070    0.272
개통일      0.655    0.070      1.000    0.548
간선열차    0.501    0.272      0.548    1.000

            선명     간선열차   광역열차   개통일
선명        1.000    0.501      0.338      0.655
간선열차    0.501    1.000      0.272      0.548
광역열차    0.338    0.272      1.000      0.070
개통일      0.655    0.548      0.070      1.000
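
The High correlation alerts at the top of the report are consistent with applying a cut-off to this four-variable matrix. The sketch below flags every pair at or above 0.5. That threshold is an assumption: the report does not state it, but it reproduces the three alerts and leaves 광역열차 unflagged.

```python
# Pairwise coefficients copied from the four-variable matrix in this section.
columns = ["선명", "간선열차", "광역열차", "개통일"]
matrix = [
    [1.000, 0.501, 0.338, 0.655],
    [0.501, 1.000, 0.272, 0.548],
    [0.338, 0.272, 1.000, 0.070],
    [0.655, 0.548, 0.070, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off for a "High correlation" alert

def highly_correlated(columns, matrix, threshold):
    """Map each column to the other columns whose coefficient meets the threshold."""
    result = {}
    for i, name in enumerate(columns):
        partners = [
            columns[j]
            for j, value in enumerate(matrix[i])
            if j != i and value >= threshold
        ]
        if partners:
            result[name] = partners
    return result

flagged = highly_correlated(columns, matrix, THRESHOLD)
for name, partners in flagged.items():
    print(name, "->", ", ".join(partners))
```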

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

      선명              역명            간선열차                  광역열차    개통일
0     경부고속          광명            KTX                       광역전철    2004-04-01
1     경부고속          천안아산        KTX                       해당없음    2004-04-01
2     경부고속          오송            KTX, 무궁화               해당없음    2010-11-01
3     경부고속          대전            KTX, ITX새마을, 무궁화    해당없음    2004-04-01
4     경부고속          김천(구미)      KTX                       해당없음    2010-11-01
5     경부고속          동대구          KTX, ITX새마을, 무궁화    해당없음    2004-04-01
6     경부고속          신경주          KTX, 무궁화               해당없음    2010-11-01
7     경부고속          울산            KTX                       해당없음    2010-11-01
8     경부고속          부산            KTX, ITX새마을, 무궁화    해당없음    2004-04-01
9     호남고속          공주            KTX                       해당없음    2015-04-02

      선명              역명            간선열차                  광역열차    개통일
59    영동선            동해            KTX, 무궁화, 누리로       해당없음    2020-03-02
60    영동선            묵호            KTX, 누리로               해당없음    2020-03-02
61    영동선            정동진          KTX, 누리로               해당없음    2020-03-02
62    경강선            부발            KTX                       광역전철    2021-12-31
63    경강선(강릉선)    만종            KTX                       해당없음    2017-12-22
64    경강선(강릉선)    횡성            KTX                       해당없음    2017-12-22
65    경강선(강릉선)    둔내            KTX                       해당없음    2017-12-22
66    경강선(강릉선)    평창            KTX                       해당없음    2017-12-22
67    경강선(강릉선)    진부(오대산)    KTX                       해당없음    2017-12-22
68    경강선(강릉선)    강릉            KTX, 누리로               해당없음    2017-12-22