Overview

Dataset statistics

Number of variables: 4
Number of observations: 99
Missing cells: 1
Missing cells (%): 0.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.3 KiB
Average record size in memory: 34.3 B
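
The missing-cell percentage can be reproduced from the other figures in this block. The following is a minimal sketch, assuming the percentage is taken over all cells (variables times observations); the variable names are illustrative and do not come from the report.

```python
# Values copied from the "Dataset statistics" block.
n_variables = 4
n_observations = 99
missing_cells = 1

# Assumption: the denominator is every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(total_cells)            # number of cells in the dataset
print(f"{missing_pct:.1f}%")  # rounded the same way the report rounds
```

Under that assumption, one missing cell out of 396 rounds to the 0.3% shown in the report.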

Variable types

Categorical: 2
Text: 1
Numeric: 1

Dataset

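Description: Data on inter-station distances for Line 1 operated by Seoul Metro (서울교통공사), with fields for railway operator name (철도운영기관명), line name (선명), station name (역명), and inter-station distance (역간거리).
Author: 국가철도공단 (Korea National Railway)
URL: https://www.data.go.kr/data/15041460/fileData.do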

Alerts

선명 has constant value "" [Constant]
역간거리(km) is highly overall correlated with 철도운영기관명 [High correlation]
철도운영기관명 is highly overall correlated with 역간거리(km) [High correlation]
철도운영기관명 is highly imbalanced (52.8%) [Imbalance]
역간거리(km) has 1 (1.0%) missing values [Missing]
역명 has unique values [Unique]
역간거리(km) has 1 (1.0%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-12 10:37:46.738974
Analysis finished: 2023-12-12 10:37:47.395848
Duration: 0.66 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

철도운영기관명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B
코레일: 89
서울교통공사: 10

Length

Max length: 6
Median length: 3
Mean length: 3.3030303
Min length: 3
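
The length statistics follow directly from the two category values and their counts (코레일 appears 89 times, 서울교통공사 10 times). This is a small sketch that rebuilds them with the standard library, assuming length is measured in Unicode code points as Python's `len` does; the variable names are illustrative.

```python
from statistics import mean, median

# Category counts as listed for 철도운영기관명.
counts = {"코레일": 89, "서울교통공사": 10}

# One length per observation: 89 threes and 10 sixes.
lengths = [len(name) for name, n in counts.items() for _ in range(n)]

print(max(lengths), median(lengths), min(lengths))
print(round(mean(lengths), 7))
```

The mean works out to 327 / 99, which agrees with the reported 3.3030303.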

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 코레일
2nd row: 코레일
3rd row: 코레일
4th row: 코레일
5th row: 코레일

Common Values

Value         Count  Frequency (%)
코레일        89     89.9%
서울교통공사  10     10.1%
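
The 52.8% imbalance figure in the alerts is consistent with a normalized-entropy score computed from these two counts. The sketch below assumes the score is 1 minus the Shannon entropy divided by log2 of the number of classes; the function name `imbalance_score` and its handling of a single class are illustrative assumptions, not taken from the report.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, near 1 for one dominant class."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumed convention when only one class exists
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Counts from the Common Values table.
counts = {"코레일": 89, "서울교통공사": 10}
n = sum(counts.values())

score = imbalance_score(list(counts.values()))
shares = {name: round(100 * c / n, 1) for name, c in counts.items()}

print(f"{100 * score:.1f}%")
print(shares)
```

With an 89 to 10 split the entropy is roughly 0.47 bits out of a possible 1 bit, so the score comes out near 0.528.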

Length

Histogram of lengths of the category

선명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B
1호선: 99

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1호선
2nd row: 1호선
3rd row: 1호선
4th row: 1호선
5th row: 1호선

Common Values

Value  Count  Frequency (%)
1호선  99     100.0%

Length

Histogram of lengths of the category

역명
Text

UNIQUE 

Distinct: 99
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B

Length

Max length: 12
Median length: 2
Mean length: 2.6464646
Min length: 2

Characters and Unicode

Total characters: 262
Distinct characters: 118
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
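
The category names used in this section (Other Letter, Decimal Number, Open Punctuation, Close Punctuation) correspond to the Unicode general categories `Lo`, `Nd`, `Ps`, and `Pe`. As a rough illustration, the sketch below counts categories for three station names that appear in the sample rows of this report, using Python's `unicodedata` module. It covers only those three names, not the full column, and the ASCII check is a simplified stand-in for a real Unicode block lookup.

```python
import unicodedata
from collections import Counter

# Three station names taken from the sample rows of the report.
names = ["종로3가", "종로5가", "청량리(서울시립대입구)"]

# General category per character: Lo = Other Letter, Nd = Decimal Number,
# Ps = Open Punctuation, Pe = Close Punctuation.
categories = Counter(unicodedata.category(ch) for name in names for ch in name)

# Simplified block check: code points below 128 are in the ASCII block.
ascii_chars = sum(1 for name in names for ch in name if ord(ch) < 128)

print(categories)
print(ascii_chars)
print(max(len(name) for name in names))
```

The longest of the three names, 청량리(서울시립대입구), has 12 characters, which matches the reported maximum length.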

Unique

Unique: 99
Unique (%): 100.0%

Sample

1st row: 지행
2nd row: 도봉
3rd row: 창동
4th row: 월계
5th row: 외대앞

Value              Count  Frequency (%)
지행               1      1.0%
간석               1      1.0%
오산               1      1.0%
오산대             1      1.0%
세마               1      1.0%
서동탄             1      1.0%
병점               1      1.0%
수원               1      1.0%
화서               1      1.0%
의왕               1      1.0%
Other values (89)  89     89.9%

Most occurring characters

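Value               Count  Frequency (%)
(glyph lost)        12     4.6%
(glyph lost)        10     3.8%
(glyph lost)        10     3.8%
(glyph lost)        9      3.4%
(glyph lost)        7      2.7%
(glyph lost)        5      1.9%
(glyph lost)        5      1.9%
(glyph lost)        5      1.9%
(glyph lost)        4      1.5%
(glyph lost)        4      1.5%
Other values (108)  191    72.9%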

Most occurring categories

Value              Count  Frequency (%)
Other Letter       254    96.9%
Open Punctuation   3      1.1%
Close Punctuation  3      1.1%
Decimal Number     2      0.8%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
(glyph lost)        12     4.7%
(glyph lost)        10     3.9%
(glyph lost)        10     3.9%
(glyph lost)        9      3.5%
(glyph lost)        7      2.8%
(glyph lost)        5      2.0%
(glyph lost)        5      2.0%
(glyph lost)        5      2.0%
(glyph lost)        4      1.6%
(glyph lost)        4      1.6%
Other values (104)  183    72.0%

Decimal Number

Value  Count  Frequency (%)
3      1      50.0%
5      1      50.0%

Open Punctuation

Value  Count  Frequency (%)
(      3      100.0%

Close Punctuation

Value  Count  Frequency (%)
)      3      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  254    96.9%
Common  8      3.1%

Most frequent character per script

Hangul

Value               Count  Frequency (%)
(glyph lost)        12     4.7%
(glyph lost)        10     3.9%
(glyph lost)        10     3.9%
(glyph lost)        9      3.5%
(glyph lost)        7      2.8%
(glyph lost)        5      2.0%
(glyph lost)        5      2.0%
(glyph lost)        5      2.0%
(glyph lost)        4      1.6%
(glyph lost)        4      1.6%
Other values (104)  183    72.0%

Common

Value  Count  Frequency (%)
(      3      37.5%
)      3      37.5%
3      1      12.5%
5      1      12.5%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  254    96.9%
ASCII   8      3.1%

Most frequent character per block

Hangul

Value               Count  Frequency (%)
(glyph lost)        12     4.7%
(glyph lost)        10     3.9%
(glyph lost)        10     3.9%
(glyph lost)        9      3.5%
(glyph lost)        7      2.8%
(glyph lost)        5      2.0%
(glyph lost)        5      2.0%
(glyph lost)        5      2.0%
(glyph lost)        4      1.6%
(glyph lost)        4      1.6%
Other values (104)  183    72.0%

ASCII

Value  Count  Frequency (%)
(      3      37.5%
)      3      37.5%
3      1      12.5%
5      1      12.5%

역간거리(km)
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 37
Distinct (%): 37.8%
Missing: 1
Missing (%): 1.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0306122
Minimum: 0
Maximum: 9.4
Zeros: 1
Zeros (%): 1.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1023.0 B
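
The percentages in this summary do not all share one denominator. The sketch below assumes that Distinct (%) is taken over the 98 non-missing values, while Missing (%) and Zeros (%) are taken over all 99 observations. That reading is inferred from the reported numbers and is not stated in the report.

```python
# Values copied from the 역간거리(km) summary.
n_observations = 99
n_missing = 1
n_distinct = 37
n_zeros = 1

n_valid = n_observations - n_missing  # non-missing values

# Assumed denominators: non-missing count for distinct,
# all observations for missing and zeros.
distinct_pct = 100 * n_distinct / n_valid
missing_pct = 100 * n_missing / n_observations
zeros_pct = 100 * n_zeros / n_observations

print(f"{distinct_pct:.1f}% {missing_pct:.1f}% {zeros_pct:.1f}%")
```

Dividing 37 by 99 instead would give about 37.4%, so the reported 37.8% fits the non-missing denominator better.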

Quantile statistics

Minimum: 0
5-th percentile: 0.8
Q1: 1.2
Median: 1.5
Q3: 2.375
95-th percentile: 5.13
Maximum: 9.4
Range: 9.4
Interquartile range (IQR): 1.175

Descriptive statistics

Standard deviation: 1.4372373
Coefficient of variation (CV): 0.70778522
Kurtosis: 6.877124
Mean: 2.0306122
Median Absolute Deviation (MAD): 0.5
Skewness: 2.2644066
Sum: 199
Variance: 2.0656512
Monotonicity: Not monotonic
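
Several of these statistics are derived from one another, so the reported values can be cross-checked. This is a minimal sketch using only figures printed in this report; it assumes the mean is taken over the 98 non-missing values, and the variable names are illustrative.

```python
# Figures copied from the report for 역간거리(km).
n_valid = 99 - 1          # observations minus the one missing value
total = 199               # Sum
std = 1.4372373           # Standard deviation
q1, q3 = 1.2, 2.375       # Q1 and Q3
minimum, maximum = 0, 9.4

mean = total / n_valid             # compare with Mean
variance = std ** 2                # compare with Variance
cv = std / mean                    # compare with Coefficient of variation
iqr = q3 - q1                      # compare with Interquartile range
value_range = maximum - minimum    # compare with Range

print(f"mean={mean:.7f} variance={variance:.6f} cv={cv:.6f}")
print(f"iqr={iqr:.3f} range={value_range}")
```

Each derived value agrees with the corresponding reported figure up to the rounding of the inputs.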
Histogram with fixed size bins (bins=37)

Common values

Value              Count  Frequency (%)
1.0                9      9.1%
1.4                8      8.1%
1.2                7      7.1%
1.6                6      6.1%
1.1                6      6.1%
1.5                6      6.1%
1.3                5      5.1%
1.9                4      4.0%
1.7                4      4.0%
2.2                4      4.0%
Other values (27)  39     39.4%

Minimum 10 values

Value  Count  Frequency (%)
0.0    1      1.0%
0.6    1      1.0%
0.7    1      1.0%
0.8    4      4.0%
0.9    2      2.0%
1.0    9      9.1%
1.1    6      6.1%
1.2    7      7.1%
1.3    5      5.1%
1.4    8      8.1%

Maximum 10 values

Value  Count  Frequency (%)
9.4    1      1.0%
5.6    1      1.0%
5.4    2      2.0%
5.3    1      1.0%
5.1    1      1.0%
4.9    1      1.0%
4.8    2      2.0%
4.3    1      1.0%
4.0    1      1.0%
3.8    1      1.0%


Correlations

                철도운영기관명  역명   역간거리(km)
철도운영기관명  1.000           1.000  0.642
역명            1.000           1.000  1.000
역간거리(km)    0.642           1.000  1.000

                역간거리(km)  철도운영기관명
역간거리(km)    1.000         0.674
철도운영기관명  0.674         1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

    철도운영기관명  선명   역명                    역간거리(km)
0   코레일          1호선  지행                    5.6
1   코레일          1호선  도봉                    1.3
2   코레일          1호선  창동                    1.0
3   코레일          1호선  월계                    1.1
4   코레일          1호선  외대앞                  0.8
5   코레일          1호선  대방                    1.5
6   코레일          1호선  신길                    <NA>
7   코레일          1호선  덕정                    2.9
8   코레일          1호선  구일                    1.4
9   코레일          1호선  온수                    1.9

    철도운영기관명  선명   역명                    역간거리(km)
89  서울교통공사    1호선  서울역                  1.1
90  서울교통공사    1호선  시청                    1.0
91  서울교통공사    1호선  종각                    0.8
92  서울교통공사    1호선  종로3가                 0.9
93  서울교통공사    1호선  종로5가                 0.8
94  서울교통공사    1호선  동대문                  0.6
95  서울교통공사    1호선  신설동                  0.9
96  서울교통공사    1호선  제기동                  1.0
97  서울교통공사    1호선  청량리(서울시립대입구)  0.0
98  서울교통공사    1호선  동묘앞                  0.7