Overview

Dataset statistics

Number of variables: 7
Number of observations: 405
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 23.5 KiB
Average record size in memory: 59.3 B

Variable types

Categorical: 3
Text: 1
Numeric: 2
Boolean: 1

Dataset

Description: Installation status of lifting equipment such as escalators and elevators
Author: 한국철도공사 (Korea Railroad Corporation)
URL: https://www.data.go.kr/data/15042135/fileData.do

Alerts

운영기관 has constant value "한국철도공사" [Constant]
엘리베이터 is highly overall correlated with 에스컬레이터 [High correlation]
에스컬레이터 is highly overall correlated with 엘리베이터 [High correlation]
휠체어리프트 is highly imbalanced (81.1%) [Imbalance]
역명 has unique values [Unique]
엘리베이터 has 300 (74.1%) zeros [Zeros]
에스컬레이터 has 331 (81.7%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-12 12:46:55.256736
Analysis finished: 2023-12-12 12:46:56.159975
Duration: 0.9 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

운영기관 (operating agency)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 한국철도공사
2nd row: 한국철도공사
3rd row: 한국철도공사
4th row: 한국철도공사
5th row: 한국철도공사

Common Values

Value          Count   Frequency (%)
한국철도공사   405     100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

노선 (line)
Categorical

Distinct: 45
Distinct (%): 11.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 7
Median length: 3
Mean length: 3.1234568
Min length: 3

Unique

Unique: 19
Unique (%): 4.7%
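
The report separates distinct values (how many different values occur) from unique values (values that occur exactly once), which is why 노선 can have 45 distinct values but only 19 unique ones. The following is a minimal sketch of that distinction, applied to the five sample values of 노선 shown in this report. The helper name `distinct_and_unique` is an illustrative assumption and not part of ydata-profiling.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique).

    distinct: number of different values
    unique:   number of values that occur exactly once
    """
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# The five sample rows of 노선 listed in the report.
sample = ["가야선", "강경선", "호남고속선", "경부고속선", "경부고속선"]
print(distinct_and_unique(sample))  # (4, 3)
```

On a constant column such as 운영기관 the same function returns one distinct value and zero unique values, matching the figures reported for that variable.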

Sample

1st row: 가야선
2nd row: 강경선
3rd row: 호남고속선
4th row: 경부고속선
5th row: 경부고속선

Common Values

Value               Count   Frequency (%)
경부선              60      14.8%
중앙선              59      14.6%
호남선              36      8.9%
경전선              33      8.1%
영동선              30      7.4%
전라선              27      6.7%
동해선              23      5.7%
장항선              22      5.4%
태백선              17      4.2%
충북선              15      3.7%
Other values (35)   83      20.5%

Length

[Figure: histogram of lengths of the category]

역명 (station name)
Text

UNIQUE

Distinct: 405
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 6
Median length: 2
Mean length: 2.2740741
Min length: 2

Characters and Unicode

Total characters: 921
Distinct characters: 220
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
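
As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the report's own implementation. The short codes map to the category names used in this report: `Lo` is Other Letter, `Ps` is Open Punctuation, and `Pe` is Close Punctuation. The function name `category_counts` is an assumption made for this example.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count Unicode general categories (e.g. 'Lo', 'Ps', 'Pe') over the characters of text."""
    return Counter(unicodedata.category(ch) for ch in text)


# "김천(구미)" is one of the sample station names in this report.
counts = category_counts("김천(구미)")
for category, n in sorted(counts.items()):
    print(category, n)
```

The four Hangul syllables fall under Other Letter and the two parentheses under Open and Close Punctuation, which are the three categories the report lists for 역명.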

Unique

Unique: 405
Unique (%): 100.0%

Sample

1st row: 가야
2nd row: 연무대
3rd row: 공주
4th row: 광명
5th row: 김천(구미)

Value                Count   Frequency (%)
가야                 1       0.2%
여수엑스포           1       0.2%
단촌                 1       0.2%
단양                 1       0.2%
단성                 1       0.2%
구학                 1       0.2%
고명                 1       0.2%
경주                 1       0.2%
건천                 1       0.2%
갑현                 1       0.2%
Other values (395)   395     97.5%

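Most occurring characters

Value                Count   Frequency (%)
(not preserved)      36      3.9%
(not preserved)      30      3.3%
(not preserved)      28      3.0%
(not preserved)      23      2.5%
(not preserved)      21      2.3%
(not preserved)      19      2.1%
(not preserved)      19      2.1%
(not preserved)      17      1.8%
(not preserved)      16      1.7%
(not preserved)      16      1.7%
Other values (210)   696     75.6%

Most occurring categories

Value               Count   Frequency (%)
Other Letter        915     99.3%
Open Punctuation    3       0.3%
Close Punctuation   3       0.3%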

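Most frequent character per category

Other Letter
Value                Count   Frequency (%)
(not preserved)      36      3.9%
(not preserved)      30      3.3%
(not preserved)      28      3.1%
(not preserved)      23      2.5%
(not preserved)      21      2.3%
(not preserved)      19      2.1%
(not preserved)      19      2.1%
(not preserved)      17      1.9%
(not preserved)      16      1.7%
(not preserved)      16      1.7%
Other values (208)   690     75.4%

Open Punctuation
Value   Count   Frequency (%)
(       3       100.0%

Close Punctuation
Value   Count   Frequency (%)
)       3       100.0%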

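Most occurring scripts

Value    Count   Frequency (%)
Hangul   915     99.3%
Common   6       0.7%

Most frequent character per script

Hangul
Value                Count   Frequency (%)
(not preserved)      36      3.9%
(not preserved)      30      3.3%
(not preserved)      28      3.1%
(not preserved)      23      2.5%
(not preserved)      21      2.3%
(not preserved)      19      2.1%
(not preserved)      19      2.1%
(not preserved)      17      1.9%
(not preserved)      16      1.7%
(not preserved)      16      1.7%
Other values (208)   690     75.4%

Common
Value   Count   Frequency (%)
(       3       50.0%
)       3       50.0%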

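Most occurring blocks

Value    Count   Frequency (%)
Hangul   915     99.3%
ASCII    6       0.7%

Most frequent character per block

Hangul
Value                Count   Frequency (%)
(not preserved)      36      3.9%
(not preserved)      30      3.3%
(not preserved)      28      3.1%
(not preserved)      23      2.5%
(not preserved)      21      2.3%
(not preserved)      19      2.1%
(not preserved)      19      2.1%
(not preserved)      17      1.9%
(not preserved)      16      1.7%
(not preserved)      16      1.7%
Other values (208)   690     75.4%

ASCII
Value   Count   Frequency (%)
(       3       50.0%
)       3       50.0%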

엘리베이터 (elevator)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 15
Distinct (%): 3.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0691358
Minimum: 0
Maximum: 17
Zeros: 300
Zeros (%): 74.1%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 2
95-th percentile: 5
Maximum: 17
Range: 17
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 2.4179937
Coefficient of variation (CV): 2.2616338
Kurtosis: 14.895951
Mean: 1.0691358
Median Absolute Deviation (MAD): 0
Skewness: 3.4733561
Sum: 433
Variance: 5.8466936
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=15)]
Common Values

Value              Count   Frequency (%)
0                  300     74.1%
3                  33      8.1%
2                  32      7.9%
4                  15      3.7%
6                  6       1.5%
5                  5       1.2%
10                 3       0.7%
7                  3       0.7%
14                 2       0.5%
12                 1       0.2%
Other values (5)   5       1.2%

Minimum 10 values

Value   Count   Frequency (%)
0       300     74.1%
1       1       0.2%
2       32      7.9%
3       33      8.1%
4       15      3.7%
5       5       1.2%
6       6       1.5%
7       3       0.7%
10      3       0.7%
11      1       0.2%

Maximum 10 values

Value   Count   Frequency (%)
17      1       0.2%
16      1       0.2%
14      2       0.5%
13      1       0.2%
12      1       0.2%
11      1       0.2%
10      3       0.7%
7       3       0.7%
6       6       1.5%
5       5       1.2%
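
Because the value tables for 엘리베이터 together cover all 15 distinct values, the descriptive statistics can be recomputed from the frequency table alone. The sketch below does this with Python's standard `statistics` module. It assumes, as the reported variance suggests, that the report uses the sample (n - 1) variance. The variable names are illustrative.

```python
import statistics

# Value -> count for 엘리베이터, merged from the value tables of this variable.
elevator_counts = {0: 300, 1: 1, 2: 32, 3: 33, 4: 15, 5: 5, 6: 6, 7: 3,
                   10: 3, 11: 1, 12: 1, 13: 1, 14: 2, 16: 1, 17: 1}

# Expand the frequency table back into one value per station.
values = [v for v, n in elevator_counts.items() for _ in range(n)]

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)          # sample standard deviation
cv = std / mean                         # coefficient of variation

print(len(values), sum(values))  # 405 433
print(mean, variance, std, cv)
```

The row count and sum reproduce the 405 observations and the Sum of 433, and the remaining values agree with the Mean, Variance, Standard deviation and CV listed for this variable to the printed precision.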

에스컬레이터 (escalator)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 21
Distinct (%): 5.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.545679
Minimum: 0
Maximum: 28
Zeros: 331
Zeros (%): 81.7%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 8
Maximum: 28
Range: 28
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 4.3669565
Coefficient of variation (CV): 2.8252674
Kurtosis: 16.537443
Mean: 1.545679
Median Absolute Deviation (MAD): 0
Skewness: 3.8531656
Sum: 626
Variance: 19.070309
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=21)]
Common Values

Value               Count   Frequency (%)
0                   331     81.7%
4                   23      5.7%
8                   10      2.5%
6                   10      2.5%
12                  4       1.0%
1                   3       0.7%
10                  3       0.7%
7                   3       0.7%
14                  2       0.5%
28                  2       0.5%
Other values (11)   14      3.5%

Minimum 10 values

Value   Count   Frequency (%)
0       331     81.7%
1       3       0.7%
2       2       0.5%
3       2       0.5%
4       23      5.7%
5       1       0.2%
6       10      2.5%
7       3       0.7%
8       10      2.5%
10      3       0.7%

Maximum 10 values

Value   Count   Frequency (%)
28      2       0.5%
27      1       0.2%
26      1       0.2%
24      1       0.2%
23      1       0.2%
22      1       0.2%
20      1       0.2%
18      2       0.5%
15      1       0.2%
14      2       0.5%
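
With 81.7% of stations reporting zero escalators, every quartile of 에스컬레이터 falls on zero, which is why the interquartile range is 0 even though the maximum is 28. The sketch below rebuilds the column from its value tables and checks this with `statistics.quantiles`. Using `method="inclusive"` (linear interpolation between order statistics) is an assumption about the percentile convention, chosen because it reproduces the reported figures.

```python
import statistics

# Value -> count for 에스컬레이터, merged from the value tables of this variable.
escalator_counts = {0: 331, 1: 3, 2: 2, 3: 2, 4: 23, 5: 1, 6: 10, 7: 3, 8: 10,
                    10: 3, 12: 4, 14: 2, 15: 1, 18: 2, 20: 1, 22: 1, 23: 1,
                    24: 1, 26: 1, 27: 1, 28: 2}

# One value per station, in ascending order.
values = sorted(v for v, n in escalator_counts.items() for _ in range(n))

zero_share = escalator_counts[0] / len(values)

# Quartiles and the 95th percentile (last of the 19 cut points for n=20).
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
p95 = statistics.quantiles(values, n=20, method="inclusive")[-1]

print(f"zeros: {zero_share:.1%}")
print("Q1, median, Q3:", q1, median, q3)
print("IQR:", q3 - q1, "95th percentile:", p95)
```

For a zero-inflated count like this one, the share of zeros and the upper percentiles describe the column better than the quartiles do.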

휠체어리프트 (wheelchair lift)
Categorical

IMBALANCE

Distinct: 5
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value   Count   Frequency (%)
0       380     93.8%
3       11      2.7%
2       8       2.0%
1       5       1.2%
4       1       0.2%
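
The 81.1% imbalance figure reported for 휠체어리프트 is consistent with one minus the normalized Shannon entropy of the class counts. The sketch below implements that formula directly. It is an illustration of the calculation, not a copy of the library's code, and the function name `imbalance_score` and its handling of a single class are assumptions.

```python
import math


def imbalance_score(counts):
    """1 - normalized Shannon entropy.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # assumption: a single class is reported as not imbalanced
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


# Counts for the values 0, 3, 2, 1, 4 of 휠체어리프트.
lift_counts = [380, 11, 8, 5, 1]
print(f"{imbalance_score(lift_counts):.1%}")  # 81.1%
```

Five equally frequent classes would score 0, so the high score here reflects that 380 of 405 stations report the value 0.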

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

장애인경사로 (ramp for persons with disabilities)
Boolean

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 537.0 B

Value   Count   Frequency (%)
False   267     65.9%
True    138     34.1%

Interactions

[Figure: interaction plots (4 panels)]

Correlations

[Figure: correlation heatmap]

               노선     엘리베이터   에스컬레이터   휠체어리프트   장애인경사로
노선           1.000    0.000        0.146          0.000          0.164
엘리베이터     0.000    1.000        0.946          0.319          0.000
에스컬레이터   0.146    0.946        1.000          0.346          0.000
휠체어리프트   0.000    0.319        0.346          1.000          0.069
장애인경사로   0.164    0.000        0.000          0.069          1.000

[Figure: correlation heatmap]

               휠체어리프트   장애인경사로   노선
휠체어리프트   1.000          0.085          0.000
장애인경사로   0.085          1.000          0.128
노선           0.000          0.128          1.000

[Figure: correlation heatmap]

               엘리베이터   에스컬레이터   노선     휠체어리프트   장애인경사로
엘리베이터     1.000        0.828          0.000    0.137          0.000
에스컬레이터   0.828        1.000          0.118    0.149          0.000
노선           0.000        0.118          1.000    0.000          0.128
휠체어리프트   0.137        0.149          0.000    1.000          0.085
장애인경사로   0.000        0.000          0.128    0.085          1.000
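
To make the link between 엘리베이터 and 에스컬레이터 concrete, the sketch below computes Pearson's r on the 20 rows shown in the Sample section of this report. This is only an illustration under stated assumptions: the pairs are read from those displayed rows, the report's own coefficients come from all 405 rows, and the measure the report uses for each matrix is not stated here, so the result is not expected to equal 0.946 or 0.828.

```python
import math

# (엘리베이터, 에스컬레이터) pairs for the 20 displayed sample rows
# (row indices 0-9 and 395-404).
pairs = [(0, 0), (0, 0), (2, 6), (12, 27), (2, 10), (5, 6), (16, 20), (3, 10), (2, 14), (0, 0),
         (0, 0), (3, 0), (6, 12), (0, 0), (0, 0), (0, 0), (0, 0), (0, 0), (0, 0), (0, 0)]


def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


r = pearson(pairs)
print(round(r, 3))
```

Even on this small subset the two counts move together strongly, in line with the High correlation alerts raised for both variables.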

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

      운영기관       노선         역명         엘리베이터   에스컬레이터   휠체어리프트   장애인경사로
0     한국철도공사   가야선       가야         0            0              0              N
1     한국철도공사   강경선       연무대       0            0              0              N
2     한국철도공사   호남고속선   공주         2            6              0              N
3     한국철도공사   경부고속선   광명         12           27             0              Y
4     한국철도공사   경부고속선   김천(구미)   2            10             1              Y
5     한국철도공사   경부고속선   신경주       5            6              0              Y
6     한국철도공사   경부고속선   오송         16           20             0              N
7     한국철도공사   경부고속선   울산         3            10             0              N
8     한국철도공사   경부고속선   천안아산     2            14             0              N
9     한국철도공사   경부선       가천         0            0              0              N

Last rows

      운영기관       노선         역명         엘리베이터   에스컬레이터   휠체어리프트   장애인경사로
395   한국철도공사   호남선       임성리       0            0              3              Y
396   한국철도공사   호남선       장성         3            0              0              N
397   한국철도공사   호남고속선   정읍         6            12             0              N
398   한국철도공사   호남선       채운         0            0              0              N
399   한국철도공사   호남선       천원         0            0              0              N
400   한국철도공사   호남선       하남         0            0              0              N
401   한국철도공사   호남선       함열         0            0              0              N
402   한국철도공사   호남선       함평         0            0              2              N
403   한국철도공사   호남선       황등         0            0              0              N
404   한국철도공사   호남선       흑석리       0            0              0              N