Overview

Dataset statistics

Number of variables: 6
Number of observations: 406
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 21.1 KiB
Average record size in memory: 53.3 B

Variable types

Text: 1
Categorical: 4
Numeric: 1

Dataset

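Description: Connecting facilities to other modes of transport, by station. Fields cover the station counts per service type (KTX, conventional rail, metropolitan rail, urban rail), the station name, and transfer parking.
Author: 한국철도공사 (Korea Railroad Corporation)
URL: https://www.data.go.kr/data/15090378/fileData.do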

Alerts

광역철도역수 is highly imbalanced (71.6%) [Imbalance]
도시철도역수 is highly imbalanced (88.8%) [Imbalance]
역명 has unique values [Unique]
환승주차장(면수) has 333 (82.0%) zeros [Zeros]
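
The imbalance percentages above appear consistent with a score of one minus the Shannon entropy of the class distribution, divided by the maximum possible entropy (the log of the number of classes). The sketch below uses only the counts listed in this report and reproduces both figures. Treating this as the exact formula used by ydata-profiling is an assumption, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max for class counts (0 = balanced, 1 = one dominant class)."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # convention assumed here: a single class gets a score of 0
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return 1.0 - entropy / math.log(n_classes)

# Counts taken from the Common Values tables of this report.
metropolitan = [361, 40, 4, 1]   # 광역철도역수
urban = [396, 9, 1]              # 도시철도역수

print(f"{imbalance_score(metropolitan):.1%}")  # 71.6%
print(f"{imbalance_score(urban):.1%}")         # 88.8%
```

A score near 1 means almost every row falls into a single class, which is the case for both columns.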

Reproduction

Analysis started: 2023-12-12 03:35:18.203984
Analysis finished: 2023-12-12 03:35:18.841858
Duration: 0.64 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
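
As a quick consistency check, the reported duration can be recomputed from the two timestamps. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of this report.
started = datetime.fromisoformat("2023-12-12 03:35:18.203984")
finished = datetime.fromisoformat("2023-12-12 03:35:18.841858")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 0.64 seconds
```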

Variables

역명 (station name)
Text

UNIQUE

Distinct: 406
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 8
Median length: 2
Mean length: 2.3546798
Min length: 2

Characters and Unicode

Total characters: 956
Distinct characters: 221
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
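
To illustrate how such character properties can be read, the sketch below tallies the Unicode general category of each character in the five sample station names listed in this report, using Python's standard `unicodedata` module. The standard library does not expose the script property, so the Hangul count relies on the character name prefix as a rough proxy, which is an assumption of this example.

```python
import unicodedata
from collections import Counter

# The five sample station names listed in this report.
names = ["광명", "김천(구미)", "대전", "동대구", "부산"]

# General category per character: Lo = Other Letter,
# Ps = Open Punctuation, Pe = Close Punctuation.
categories = Counter(unicodedata.category(ch) for name in names for ch in name)
print(dict(categories))  # {'Lo': 13, 'Ps': 1, 'Pe': 1}

# Rough script check: Hangul syllables have names starting with "HANGUL".
hangul_count = sum(
    1
    for name in names
    for ch in name
    if unicodedata.name(ch, "").startswith("HANGUL")
)
print(hangul_count)  # 13
```

These are the same three categories the report finds across the full column.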

Unique

Unique: 406
Unique (%): 100.0%

Sample

1st row: 광명
2nd row: 김천(구미)
3rd row: 대전
4th row: 동대구
5th row: 부산

Value | Count | Frequency (%)
광명 | 1 | 0.2%
신탄리 | 1 | 0.2%
임진강 | 1 | 0.2%
월롱 | 1 | 0.2%
수색 | 1 | 0.2%
문산 | 1 | 0.2%
도라산 | 1 | 0.2%
한탄강 | 1 | 0.2%
초성리 | 1 | 0.2%
전곡 | 1 | 0.2%
Other values (396) | 396 | 97.5%

Most occurring characters

The character values were not preserved; counts and frequencies only.

Count | Frequency (%)
37 | 3.9%
37 | 3.9%
24 | 2.5%
23 | 2.4%
23 | 2.4%
21 | 2.2%
21 | 2.2%
18 | 1.9%
18 | 1.9%
17 | 1.8%
Other values (211): 717 | 75.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 950 | 99.4%
Open Punctuation | 3 | 0.3%
Close Punctuation | 3 | 0.3%

Most frequent character per category

Other Letter (character values not preserved)
Count | Frequency (%)
37 | 3.9%
37 | 3.9%
24 | 2.5%
23 | 2.4%
23 | 2.4%
21 | 2.2%
21 | 2.2%
18 | 1.9%
18 | 1.9%
17 | 1.8%
Other values (209): 711 | 74.8%

Open Punctuation
Value | Count | Frequency (%)
( | 3 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 950 | 99.4%
Common | 6 | 0.6%

Most frequent character per script

Hangul (character values not preserved)
Count | Frequency (%)
37 | 3.9%
37 | 3.9%
24 | 2.5%
23 | 2.4%
23 | 2.4%
21 | 2.2%
21 | 2.2%
18 | 1.9%
18 | 1.9%
17 | 1.8%
Other values (209): 711 | 74.8%

Common
Value | Count | Frequency (%)
( | 3 | 50.0%
) | 3 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 950 | 99.4%
ASCII | 6 | 0.6%

Most frequent character per block

Hangul (character values not preserved)
Count | Frequency (%)
37 | 3.9%
37 | 3.9%
24 | 2.5%
23 | 2.4%
23 | 2.4%
21 | 2.2%
21 | 2.2%
18 | 1.9%
18 | 1.9%
17 | 1.8%
Other values (209): 711 | 74.8%

ASCII
Value | Count | Frequency (%)
( | 3 | 50.0%
) | 3 | 50.0%

KTX역수 (number of KTX stations)
Categorical

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
0 | 340 | 83.7%
1 | 66 | 16.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

일반철도역수 (number of conventional rail stations)
Categorical

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 209 | 51.5%
0 | 197 | 48.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

광역철도역수 (number of metropolitan rail stations)
Categorical

IMBALANCE

Distinct: 4
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 361 | 88.9%
1 | 40 | 9.9%
2 | 4 | 1.0%
3 | 1 | 0.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

도시철도역수 (number of urban rail stations)
Categorical

IMBALANCE

Distinct: 3
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
0 | 396 | 97.5%
1 | 9 | 2.2%
2 | 1 | 0.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

환승주차장(면수) (transfer parking, number of spaces)
Real number (ℝ)

ZEROS

Distinct: 64
Distinct (%): 15.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.615764
Minimum: 0
Maximum: 1886
Zeros: 333
Zeros (%): 82.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 167
Maximum: 1886
Range: 1886
Interquartile range (IQR): 0
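
The zero quartiles follow directly from the zero count: with 333 of 406 values equal to 0 (82.0%), every quantile below roughly the 82nd percentile is 0. The sketch below illustrates this with placeholder data. The report does not list all 73 non-zero values, so the value `100` is a hypothetical stand-in; only the counts come from this report.

```python
import statistics

n_total, n_zeros = 406, 333  # counts reported above

# Placeholder positive values: any positive numbers give the same quartiles.
values = [0] * n_zeros + [100] * (n_total - n_zeros)

q1, median, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
print(q1, median, q3, iqr)          # 0.0 0.0 0.0 0.0
print(f"{n_zeros / n_total:.1%}")   # 82.0%
```

An IQR of 0 also means IQR-based outlier rules would flag every non-zero station, so they are not informative for this column.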

Descriptive statistics

Standard deviation: 145.35729
Coefficient of variation (CV): 3.9698008
Kurtosis: 78.173219
Mean: 36.615764
Median Absolute Deviation (MAD): 0
Skewness: 7.7886892
Sum: 14866
Variance: 21128.741
Monotonicity: Not monotonic
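
Several of these figures are related by simple identities: the mean is the sum divided by the number of observations, the variance is the squared standard deviation, and the coefficient of variation is the standard deviation divided by the mean. The sketch below rechecks them from the reported values; the variable names are illustrative.

```python
n = 406            # number of observations
total = 14866      # Sum
std = 145.35729    # Standard deviation

mean = total / n        # reported: 36.615764
variance = std ** 2     # reported: 21128.741
cv = std / mean         # reported: 3.9698008

print(round(mean, 6), round(variance, 2), round(cv, 4))
```

A CV close to 4 together with a skewness of about 7.8 reflects a distribution dominated by zeros with a few very large parking facilities.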
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 333 | 82.0%
96 | 3 | 0.7%
102 | 2 | 0.5%
49 | 2 | 0.5%
37 | 2 | 0.5%
100 | 2 | 0.5%
76 | 2 | 0.5%
120 | 2 | 0.5%
62 | 2 | 0.5%
161 | 2 | 0.5%
Other values (54) | 54 | 13.3%

Minimum 10 values

Value | Count | Frequency (%)
0 | 333 | 82.0%
31 | 1 | 0.2%
35 | 1 | 0.2%
37 | 2 | 0.5%
44 | 1 | 0.2%
45 | 1 | 0.2%
48 | 1 | 0.2%
49 | 2 | 0.5%
50 | 1 | 0.2%
52 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
1886 | 1 | 0.2%
1070 | 1 | 0.2%
984 | 1 | 0.2%
779 | 1 | 0.2%
696 | 1 | 0.2%
688 | 1 | 0.2%
508 | 1 | 0.2%
437 | 1 | 0.2%
414 | 1 | 0.2%
390 | 1 | 0.2%

Interactions


Correlations

Variable | KTX역수 | 일반철도역수 | 광역철도역수 | 도시철도역수 | 환승주차장(면수)
KTX역수 | 1.000 | 0.291 | 0.358 | 0.162 | 0.401
일반철도역수 | 0.291 | 1.000 | 0.000 | 0.037 | 0.153
광역철도역수 | 0.358 | 0.000 | 1.000 | 0.438 | 0.000
도시철도역수 | 0.162 | 0.037 | 0.438 | 1.000 | 0.515
환승주차장(면수) | 0.401 | 0.153 | 0.000 | 0.515 | 1.000

Variable | 일반철도역수 | KTX역수 | 광역철도역수 | 도시철도역수
일반철도역수 | 1.000 | 0.188 | 0.000 | 0.061
KTX역수 | 0.188 | 1.000 | 0.239 | 0.267
광역철도역수 | 0.000 | 0.239 | 1.000 | 0.431
도시철도역수 | 0.061 | 0.267 | 0.431 | 1.000

Variable | 환승주차장(면수) | KTX역수 | 일반철도역수 | 광역철도역수 | 도시철도역수
환승주차장(면수) | 1.000 | 0.427 | 0.162 | 0.000 | 0.400
KTX역수 | 0.427 | 1.000 | 0.188 | 0.239 | 0.267
일반철도역수 | 0.162 | 0.188 | 1.000 | 0.000 | 0.061
광역철도역수 | 0.000 | 0.239 | 0.000 | 1.000 | 0.431
도시철도역수 | 0.400 | 0.267 | 0.061 | 0.431 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

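First rows

Index | 역명 | KTX역수 | 일반철도역수 | 광역철도역수 | 도시철도역수 | 환승주차장(면수)
0 | 광명 | 1 | 0 | 1 | 0 | 984
1 | 김천(구미) | 1 | 0 | 0 | 0 | 96
2 | 대전 | 1 | 1 | 0 | 1 | 696
3 | 동대구 | 1 | 1 | 0 | 1 | 414
4 | 부산 | 1 | 1 | 0 | 1 | 688
5 | 신경주 | 1 | 1 | 0 | 0 | 357
6 | 오송 | 1 | 1 | 0 | 0 | 437
7 | 울산 | 1 | 0 | 1 | 0 | 779
8 | 천안아산 | 1 | 0 | 0 | 0 | 1070
9 | 공주 | 1 | 0 | 0 | 0 | 0

Last rows

Index | 역명 | KTX역수 | 일반철도역수 | 광역철도역수 | 도시철도역수 | 환승주차장(면수)
396 | 울산항 | 0 | 0 | 0 | 0 | 0
397 | 장생포 | 0 | 0 | 0 | 0 | 0
398 | 장성화물 | 0 | 0 | 0 | 0 | 0
399 | 경화 | 0 | 0 | 0 | 0 | 0
400 | 남창원 | 0 | 0 | 0 | 0 | 0
401 | 성주사 | 0 | 0 | 0 | 0 | 0
402 | 신창원 | 0 | 0 | 0 | 0 | 0
403 | 진해 | 0 | 0 | 0 | 0 | 0
404 | 통해 | 0 | 0 | 0 | 0 | 0
405 | 함백 | 0 | 0 | 0 | 0 | 0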