Overview

Dataset statistics

Number of variables: 12
Number of observations: 114
Missing cells: 372
Missing cells (%): 27.2%
Duplicate rows: 5
Duplicate rows (%): 4.4%
Total size in memory: 10.8 KiB
Average record size in memory: 97.2 B
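
The percentages above follow from the raw counts. A minimal sketch of the arithmetic, assuming the report derives them as simple ratios (the variable names are illustrative and are not taken from ydata-profiling):

```python
# Counts copied from the dataset statistics above.
n_variables = 12
n_observations = 114
missing_cells = 372
duplicate_rows = 5
total_size_kib = 10.8  # rounded value as printed in the report

total_cells = n_variables * n_observations              # every cell in the table
missing_pct = 100 * missing_cells / total_cells         # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations   # share of duplicate rows

# Approximate only: the printed total size is itself rounded to one decimal.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{missing_pct:.1f}% missing, {duplicate_pct:.1f}% duplicate rows")
print(f"about {avg_record_bytes:.1f} B per record")
```

The per-record figure computed this way lands close to, but not exactly on, the reported 97.2 B, which is expected because 10.8 KiB is a rounded input.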

Variable types

Text: 2
Categorical: 1
Unsupported: 9

Dataset

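Description: This dataset provides Korea Railroad Corporation (KORAIL) year-by-year scrapping plans by rolling stock type, including motive power units (high-speed, conventional, electric), motive power units (metropolitan), and passenger cars (conventional, tourist).
URL: https://www.data.go.kr/data/15088884/fileData.do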

Alerts

Dataset has 5 (4.4%) duplicate rows | Duplicates
연도별 차량 보유 및 폐차계획 has 106 (93.0%) missing values | Missing
Unnamed: 1 has 78 (68.4%) missing values | Missing
Unnamed: 3 has 9 (7.9%) missing values | Missing
Unnamed: 4 has 10 (8.8%) missing values | Missing
Unnamed: 5 has 10 (8.8%) missing values | Missing
Unnamed: 6 has 10 (8.8%) missing values | Missing
Unnamed: 7 has 12 (10.5%) missing values | Missing
Unnamed: 8 has 11 (9.6%) missing values | Missing
Unnamed: 9 has 11 (9.6%) missing values | Missing
Unnamed: 10 has 10 (8.8%) missing values | Missing
* (보유) 해당연도 1월1일 보유량 기준으로 작성 has 105 (92.1%) missing values | Missing
Unnamed: 3 is an unsupported type, check if it needs cleaning or further analysis | Unsupported
Unnamed: 4 is an unsupported type, check if it needs cleaning or further analysis | Unsupported
Unnamed: 5 is an unsupported type, check if it needs cleaning or further analysis | Unsupported
Unnamed: 6 is an unsupported type, check if it needs cleaning or further analysis | Unsupported
Unnamed: 7 is an unsupported type, check if it needs cleaning or further analysis | Unsupported
Unnamed: 8 is an unsupported type, check if it needs cleaning or further analysis | Unsupported
Unnamed: 9 is an unsupported type, check if it needs cleaning or further analysis | Unsupported
Unnamed: 10 is an unsupported type, check if it needs cleaning or further analysis | Unsupported
* (보유) 해당연도 1월1일 보유량 기준으로 작성 is an unsupported type, check if it needs cleaning or further analysis | Unsupported
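
The missing-value and unsupported-type alerts are what one would expect if a spreadsheet with a title row, a two-level header and merged category cells were read as a flat table. A minimal sketch of one way to tidy such a layout follows, using plain Python lists and a few rows abridged from the Sample section of this report. The function name `tidy`, the tuple layout and the choice to skip the row after the header are assumptions for illustration, not part of the dataset or of ydata-profiling.

```python
# Rows abridged from the report's Sample section; None marks a missing cell.
# Only the first two year columns are kept so the example stays short.
raw_rows = [
    [None, None, None, None, None],        # note row, no table data
    ["구 분", None, None, 2021, 2022],      # real header row
    [None, None, None, "실적", "실적"],      # sub-header row
    ["동력차 (고속, 일반, 전동)", "KTX", "보유", 920, 920],
    [None, None, "폐차", 0, 0],
    [None, None, "도입", 0, 0],
    [None, "KTX-산천", "보유", 240, 240],
    [None, None, "폐차", 0, 0],
]

def tidy(rows):
    """Promote the '구 분' row to a header and forward-fill merged cells."""
    header_idx = next(i for i, r in enumerate(rows) if r[0] == "구 분")
    years = rows[header_idx][3:]
    records = []
    category = model = None
    # Skip the header row and the sub-header row that follows it.
    for row in rows[header_idx + 2:]:
        category = row[0] or category  # carry the last seen category down
        model = row[1] or model        # carry the last seen model down
        for year, value in zip(years, row[3:]):
            records.append((category, model, row[2], year, value))
    return records

records = tidy(raw_rows)
print(records[0])
print(len(records))
```

Each output tuple holds one (category, model, measure, year, value) observation, so the year columns end up with a single numeric type instead of mixing header labels and counts.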

Reproduction

Analysis started: 2023-12-12 18:01:29.854268
Analysis finished: 2023-12-12 18:01:30.731660
Duration: 0.88 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연도별 차량 보유 및 폐차계획
Text

Distinct: 8
Distinct (%): 100.0%
Missing: 106
Missing (%): 93.0%
Memory size: 1.0 KiB

Length

Max length: 16
Median length: 10
Mean length: 6.875
Min length: 2

Characters and Unicode

Total characters: 55
Distinct characters: 25
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
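
As a rough illustration of the character properties mentioned above, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one sample value of this variable. It is an assumption-laden illustration, not the report's own implementation, which may compute these statistics differently.

```python
import unicodedata
from collections import Counter

# Sample value from this variable, normalized so Hangul syllables are precomposed.
value = unicodedata.normalize("NFC", "동력차 (고속, 일반, 전동)")

# Map each character to its general category code:
# Lo = Other Letter, Zs = Space Separator, Ps/Pe = Open/Close Punctuation,
# Po = Other Punctuation.
categories = Counter(unicodedata.category(ch) for ch in value)

for code, count in categories.most_common():
    print(code, count)
```

The two-letter codes correspond to the category names used in the tables of this section, for example `Lo` for Other Letter.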

Unique

Unique: 8
Unique (%): 100.0%

Sample

1st row: 구 분
2nd row: 동력차 (고속, 일반, 전동)
3rd row: 동력차 (일반)
4th row: 동력차 (광역)
5th row: 객차 (일반, 관광)
Value | Count | Frequency (%)
동력차 | 3 | 17.6%
일반 | 3 | 17.6%
 | 1 | 5.9%
 | 1 | 5.9%
고속 | 1 | 5.9%
전동 | 1 | 5.9%
광역 | 1 | 5.9%
객차 | 1 | 5.9%
관광 | 1 | 5.9%
화차 | 1 | 5.9%
Other values (3) | 3 | 17.6%

Most occurring characters

Value | Count | Frequency (%)
 | 5 | 9.1%
 | 5 | 9.1%
 | 5 | 9.1%
( | 4 | 7.3%
) | 4 | 7.3%
 | 4 | 7.3%
, | 3 | 5.5%
 | 3 | 5.5%
 | 3 | 5.5%
 | 3 | 5.5%
Other values (15) | 16 | 29.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 33 | 60.0%
Space Separator | 6 | 10.9%
Control | 5 | 9.1%
Open Punctuation | 4 | 7.3%
Close Punctuation | 4 | 7.3%
Other Punctuation | 3 | 5.5%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 5 | 15.2%
 | 4 | 12.1%
 | 3 | 9.1%
 | 3 | 9.1%
 | 3 | 9.1%
 | 2 | 6.1%
 | 1 | 3.0%
 | 1 | 3.0%
 | 1 | 3.0%
 | 1 | 3.0%
Other values (9) | 9 | 27.3%

Space Separator
Value | Count | Frequency (%)
 | 5 | 83.3%
 | 1 | 16.7%

Control
Value | Count | Frequency (%)
 | 5 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 4 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 4 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 33 | 60.0%
Common | 22 | 40.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 5 | 15.2%
 | 4 | 12.1%
 | 3 | 9.1%
 | 3 | 9.1%
 | 3 | 9.1%
 | 2 | 6.1%
 | 1 | 3.0%
 | 1 | 3.0%
 | 1 | 3.0%
 | 1 | 3.0%
Other values (9) | 9 | 27.3%

Common
Value | Count | Frequency (%)
 | 5 | 22.7%
 | 5 | 22.7%
( | 4 | 18.2%
) | 4 | 18.2%
, | 3 | 13.6%
 | 1 | 4.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 33 | 60.0%
ASCII | 21 | 38.2%
None | 1 | 1.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
 | 5 | 23.8%
 | 5 | 23.8%
( | 4 | 19.0%
) | 4 | 19.0%
, | 3 | 14.3%

Hangul
Value | Count | Frequency (%)
 | 5 | 15.2%
 | 4 | 12.1%
 | 3 | 9.1%
 | 3 | 9.1%
 | 3 | 9.1%
 | 2 | 6.1%
 | 1 | 3.0%
 | 1 | 3.0%
 | 1 | 3.0%
 | 1 | 3.0%
Other values (9) | 9 | 27.3%

None
Value | Count | Frequency (%)
 | 1 | 100.0%

Unnamed: 1
Text

MISSING 

Distinct: 30
Distinct (%): 83.3%
Missing: 78
Missing (%): 68.4%
Memory size: 1.0 KiB

Length

Max length: 16
Median length: 11
Mean length: 4.9444444
Min length: 2

Characters and Unicode

Total characters: 178
Distinct characters: 73
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 28
Unique (%): 77.8%
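
The gap between Distinct (30) and Unique (28) is consistent with two values that repeat. A minimal sketch, assuming "distinct" counts the different non-missing values and "unique" counts values that occur exactly once. The repeat counts for `소 계` and `기타` are read off the Duplicate rows table of this report, and the other 28 values are hypothetical placeholders rather than real data.

```python
from collections import Counter

# 36 non-missing values: two repeated labels plus 28 placeholder singletons.
values = ["소 계"] * 6 + ["기타"] * 2 + [f"model_{i}" for i in range(28)]

counts = Counter(values)
n_distinct = len(counts)                              # different values
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(n_distinct, f"{100 * n_distinct / len(values):.1f}%")
print(n_unique, f"{100 * n_unique / len(values):.1f}%")
```

Under these assumptions the counts and percentages match the figures reported for this variable.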

Sample

1st row: KTX
2nd row: KTX-산천
3rd row: KTX-산천 (호남)
4th row: KTX-산천 (원강)
5th row: KTX-이음
Value | Count | Frequency (%)
 | 6 | 12.5%
 | 6 | 12.5%
ktx-산천 | 3 | 6.2%
디젤동차 | 2 | 4.2%
전기동차 | 2 | 4.2%
기타 | 2 | 4.2%
호남 | 1 | 2.1%
무궁화 | 1 | 2.1%
인버터 | 1 | 2.1%
디젤기중기 | 1 | 2.1%
Other values (23) | 23 | 47.9%

Most occurring characters

Value | Count | Frequency (%)
 | 14 | 7.9%
 | 13 | 7.3%
- | 8 | 4.5%
 | 7 | 3.9%
T | 7 | 3.9%
X | 7 | 3.9%
 | 6 | 3.4%
 | 6 | 3.4%
) | 5 | 2.8%
K | 5 | 2.8%
Other values (63) | 100 | 56.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 104 | 58.4%
Uppercase Letter | 31 | 17.4%
Space Separator | 13 | 7.3%
Dash Punctuation | 8 | 4.5%
Decimal Number | 6 | 3.4%
Close Punctuation | 5 | 2.8%
Control | 5 | 2.8%
Open Punctuation | 5 | 2.8%
Other Punctuation | 1 | 0.6%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 14 | 13.5%
 | 7 | 6.7%
 | 6 | 5.8%
 | 6 | 5.8%
 | 4 | 3.8%
 | 4 | 3.8%
 | 3 | 2.9%
 | 3 | 2.9%
 | 3 | 2.9%
 | 3 | 2.9%
Other values (43) | 51 | 49.0%

Uppercase Letter
Value | Count | Frequency (%)
T | 7 | 22.6%
X | 7 | 22.6%
K | 5 | 16.1%
E | 3 | 9.7%
L | 2 | 6.5%
I | 2 | 6.5%
U | 2 | 6.5%
M | 2 | 6.5%
D | 1 | 3.2%

Decimal Number
Value | Count | Frequency (%)
0 | 2 | 33.3%
3 | 1 | 16.7%
2 | 1 | 16.7%
1 | 1 | 16.7%
5 | 1 | 16.7%

Space Separator
Value | Count | Frequency (%)
 | 13 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 8 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 5 | 100.0%

Control
Value | Count | Frequency (%)
 | 5 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 5 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 104 | 58.4%
Common | 43 | 24.2%
Latin | 31 | 17.4%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 14 | 13.5%
 | 7 | 6.7%
 | 6 | 5.8%
 | 6 | 5.8%
 | 4 | 3.8%
 | 4 | 3.8%
 | 3 | 2.9%
 | 3 | 2.9%
 | 3 | 2.9%
 | 3 | 2.9%
Other values (43) | 51 | 49.0%

Common
Value | Count | Frequency (%)
 | 13 | 30.2%
- | 8 | 18.6%
) | 5 | 11.6%
 | 5 | 11.6%
( | 5 | 11.6%
0 | 2 | 4.7%
, | 1 | 2.3%
3 | 1 | 2.3%
2 | 1 | 2.3%
1 | 1 | 2.3%

Latin
Value | Count | Frequency (%)
T | 7 | 22.6%
X | 7 | 22.6%
K | 5 | 16.1%
E | 3 | 9.7%
L | 2 | 6.5%
I | 2 | 6.5%
U | 2 | 6.5%
M | 2 | 6.5%
D | 1 | 3.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 104 | 58.4%
ASCII | 74 | 41.6%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 14 | 13.5%
 | 7 | 6.7%
 | 6 | 5.8%
 | 6 | 5.8%
 | 4 | 3.8%
 | 4 | 3.8%
 | 3 | 2.9%
 | 3 | 2.9%
 | 3 | 2.9%
 | 3 | 2.9%
Other values (43) | 51 | 49.0%

ASCII
Value | Count | Frequency (%)
 | 13 | 17.6%
- | 8 | 10.8%
T | 7 | 9.5%
X | 7 | 9.5%
) | 5 | 6.8%
K | 5 | 6.8%
 | 5 | 6.8%
( | 5 | 6.8%
E | 3 | 4.1%
0 | 2 | 2.7%
Other values (10) | 14 | 18.9%

Unnamed: 2
Categorical

Distinct: 5
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB
Value | Count
보유 | 36
폐차 | 36
도입 | 36
<NA> | 5
연도말 | 1

Length

Max length: 4
Median length: 2
Mean length: 2.0964912
Min length: 2

Unique

Unique: 1
Unique (%): 0.9%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
보유 | 36 | 31.6%
폐차 | 36 | 31.6%
도입 | 36 | 31.6%
<NA> | 5 | 4.4%
연도말 | 1 | 0.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
보유 | 36 | 31.6%
폐차 | 36 | 31.6%
도입 | 36 | 31.6%
na | 5 | 4.4%
연도말 | 1 | 0.9%

Unnamed: 3
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 9
Missing (%): 7.9%
Memory size: 1.0 KiB

Unnamed: 4
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10
Missing (%): 8.8%
Memory size: 1.0 KiB

Unnamed: 5
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10
Missing (%): 8.8%
Memory size: 1.0 KiB

Unnamed: 6
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10
Missing (%): 8.8%
Memory size: 1.0 KiB

Unnamed: 7
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 12
Missing (%): 10.5%
Memory size: 1.0 KiB

Unnamed: 8
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 11
Missing (%): 9.6%
Memory size: 1.0 KiB

Unnamed: 9
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 11
Missing (%): 9.6%
Memory size: 1.0 KiB

Unnamed: 10
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10
Missing (%): 8.8%
Memory size: 1.0 KiB

* (보유) 해당연도 1월1일 보유량 기준으로 작성
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 105
Missing (%): 92.1%
Memory size: 1.0 KiB

Correlations

 | 연도별 차량 보유 및 폐차계획 | Unnamed: 1 | Unnamed: 2
연도별 차량 보유 및 폐차계획 | 1.000 | 1.000 | NaN
Unnamed: 1 | 1.000 | 1.000 | NaN
Unnamed: 2 | NaN | NaN | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
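
Nullity correlation can be sketched as the Pearson correlation between per-column "is present" indicators. The example below is a minimal standard-library illustration on made-up columns. The function name `nullity_correlation` and the sample lists are hypothetical, and the report's own plotting code may compute the heatmap differently.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Undefined when a column is fully present or fully missing.
        return float("nan")
    return cov / sqrt(var_a * var_b)

# Hypothetical columns: x and y are missing together, z is missing where x is present.
x = [1, None, 3, None]
y = ["a", None, "c", None]
z = [None, 5, None, 7]

print(nullity_correlation(x, y))
print(nullity_correlation(x, z))
```

A value near 1 means two columns tend to be missing in the same rows, and a value near -1 means one tends to be present where the other is missing.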

Sample

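 | 연도별 차량 보유 및 폐차계획 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | * (보유) 해당연도 1월1일 보유량 기준으로 작성
0 | <NA> | <NA> | <NA> | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | * (폐차) 해당연도 폐차(예정) 수량
1 | <NA> | <NA> | <NA> | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | * (도입) 계약완료건 반영
2 | <NA> | <NA> | <NA> | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2023-01-01 00:00:00
3 | 구 분 | <NA> | <NA> | 2021 | 2022 | 2023 | 2024 | 2025 | 2026 | 2027 | 2028 | 비고
4 | <NA> | <NA> | <NA> | 실적 | 실적 | 계획 | 계획 | 계획 | 계획 | 계획 | 계획 | NaN
5 | 동력차 (고속, 일반, 전동) | KTX | 보유 | 920 | 920 | 920 | 920 | 920 | 920 | 920 | 920 | NaN
6 | <NA> | <NA> | 폐차 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | NaN
7 | <NA> | <NA> | 도입 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | NaN
8 | <NA> | KTX-산천 | 보유 | 240 | 240 | 240 | 240 | 240 | 240 | 240 | 240 | NaN
9 | <NA> | <NA> | 폐차 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | NaN

 | 연도별 차량 보유 및 폐차계획 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | * (보유) 해당연도 1월1일 보유량 기준으로 작성
104 | <NA> | 디젤기중기 | 보유 | 13 | 13 | 12 | 12 | 11 | 11 | 9 | 8 | NaN
105 | <NA> | <NA> | 폐차 | 0 | 1 | NaN | 1 | NaN | 2 | 1 | 2 | NaN
106 | <NA> | <NA> | 도입 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
107 | <NA> | 소 계 | 보유 | 112 | 103 | 101 | 101 | 89 | 89 | 59 | 8 | NaN
108 | <NA> | <NA> | 폐차 | 9 | 2 | 0 | 12 | 0 | 30 | 51 | 2 | NaN
109 | <NA> | <NA> | 도입 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | NaN
110 | 합계 | 전체 | 보유 | 15837 | 15530 | 14871 | 14167 | 13946 | 13750 | 12801 | 11569 | NaN
111 | <NA> | <NA> | 폐차 | 463 | 909 | 1589 | 581 | 338 | 1018 | 1504 | 1122 | NaN
112 | <NA> | <NA> | 도입 | 156 | 250 | 885 | 360 | 142 | 69 | 272 | 450 | NaN
113 | <NA> | <NA> | 연도말 | 15530 | 14871 | 14167 | 13946 | 13750 | 12801 | 11569 | 10897 | NaN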
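
The totals in the last rows appear to satisfy year-end (연도말) = holdings (보유) minus scrapped (폐차) plus introduced (도입), with each year-end figure carried over as the next year's holdings. A minimal check follows, assuming the digit runs in the 합계 rows split into eight yearly values for 2021 to 2028 as listed in the code. The English variable names are illustrative only.

```python
# 합계 rows for 2021..2028 from the sample's last rows (column split is an assumption).
held       = [15837, 15530, 14871, 14167, 13946, 13750, 12801, 11569]  # 보유
scrapped   = [463, 909, 1589, 581, 338, 1018, 1504, 1122]              # 폐차
introduced = [156, 250, 885, 360, 142, 69, 272, 450]                   # 도입
year_end   = [15530, 14871, 14167, 13946, 13750, 12801, 11569, 10897]  # 연도말

# Year-end stock should equal opening stock minus scrapped plus introduced.
computed = [h - s + i for h, s, i in zip(held, scrapped, introduced)]
print(computed == year_end)

# Each year's closing stock should be the next year's opening stock.
print(year_end[:-1] == held[1:])
```

A check like this is a cheap way to confirm that a flattened row has been split at the right column boundaries before using the numbers.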

Duplicate rows

Most frequently occurring

 | 연도별 차량 보유 및 폐차계획 | Unnamed: 1 | Unnamed: 2 | # duplicates
2 | <NA> | <NA> | 도입 | 36
3 | <NA> | <NA> | 폐차 | 36
1 | <NA> | 소 계 | 보유 | 6
4 | <NA> | <NA> | <NA> | 4
0 | <NA> | 기타 | 보유 | 2