Overview

Dataset statistics

Number of variables: 7
Number of observations: 39
Missing cells: 4
Missing cells (%): 1.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.4 KiB
Average record size in memory: 63.4 B
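
The percentages in this report follow directly from the counts. As a minimal sketch (the helper name `percent` is illustrative and not part of ydata-profiling), the overall missing-cell share divides by the total number of cells, while the per-variable shares reported under Alerts divide by the number of observations:

```python
# Figures copied from the report; the helper name is illustrative only.
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

n_variables = 7
n_observations = 39
total_cells = n_variables * n_observations  # 273 cells in total

missing_cells_pct = percent(4, total_cells)     # all missing cells
station_count_pct = percent(1, n_observations)  # 역개수: 1 missing value
length_km_pct = percent(3, n_observations)      # 연장(km): 3 missing values

print(missing_cells_pct, station_count_pct, length_km_pct)
```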

Variable types

Numeric: 4
Categorical: 1
Text: 1
DateTime: 1

Dataset

Description: File download
Author: 서울교통공사
URL: https://data.seoul.go.kr/dataList/OA-13317/F/1/datasetView.do

Alerts

High correlation: 연번 is highly correlated overall with 호선 and 1 other field
High correlation: 호선 is highly correlated overall with 연번 and 1 other field
High correlation: 역개수 is highly correlated overall with 연장(km)
High correlation: 연장(km) is highly correlated overall with 역개수
High correlation: 기관 is highly correlated overall with 연번 and 1 other field
Missing: 역개수 has 1 (2.6%) missing value
Missing: 연장(km) has 3 (7.7%) missing values
Unique: 연번 has unique values
Unique: 구간 has unique values

Reproduction

Analysis started: 2023-12-11 06:13:18.126769
Analysis finished: 2023-12-11 06:13:20.546797
Duration: 2.42 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 39
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20
Minimum: 1
Maximum: 39
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 483.0 B

Quantile statistics

Minimum: 1
5-th percentile: 2.9
Q1: 10.5
Median: 20
Q3: 29.5
95-th percentile: 37.1
Maximum: 39
Range: 38
Interquartile range (IQR): 19

Descriptive statistics

Standard deviation: 11.401754
Coefficient of variation (CV): 0.57008771
Kurtosis: -1.2
Mean: 20
Median Absolute Deviation (MAD): 10
Skewness: 0
Sum: 780
Variance: 130
Monotonicity: Strictly increasing
[Figure: histogram with fixed-size bins (bins=39)]
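
Because 연번 is simply the sequence 1 to 39, most of the figures above can be checked by hand. The sketch below uses only Python's standard `statistics` module. It assumes the report uses the sample variance (n - 1 denominator) and linearly interpolated percentiles, which is what the `inclusive` method of `statistics.quantiles` computes; both assumptions are consistent with the numbers shown but are not stated in the report.

```python
import statistics

values = list(range(1, 40))  # 연번 runs from 1 to 39 with no gaps

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std_dev = statistics.stdev(values)
cv = std_dev / mean                     # coefficient of variation

# Quartiles and the 5th/95th percentiles with linear interpolation.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
twentiles = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentiles[0], twentiles[-1]

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, variance, round(std_dev, 6), round(cv, 8))
print(p5, q1, median, q3, p95, q3 - q1, mad)
```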
Common values
Value | Count | Frequency (%)
1 | 1 | 2.6%
2 | 1 | 2.6%
23 | 1 | 2.6%
24 | 1 | 2.6%
25 | 1 | 2.6%
26 | 1 | 2.6%
27 | 1 | 2.6%
28 | 1 | 2.6%
29 | 1 | 2.6%
30 | 1 | 2.6%
Other values (29) | 29 | 74.4%

Minimum values
Value | Count | Frequency (%)
1 | 1 | 2.6%
2 | 1 | 2.6%
3 | 1 | 2.6%
4 | 1 | 2.6%
5 | 1 | 2.6%
6 | 1 | 2.6%
7 | 1 | 2.6%
8 | 1 | 2.6%
9 | 1 | 2.6%
10 | 1 | 2.6%

Maximum values
Value | Count | Frequency (%)
39 | 1 | 2.6%
38 | 1 | 2.6%
37 | 1 | 2.6%
36 | 1 | 2.6%
35 | 1 | 2.6%
34 | 1 | 2.6%
33 | 1 | 2.6%
32 | 1 | 2.6%
31 | 1 | 2.6%
30 | 1 | 2.6%

기관
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 10.3%
Missing: 0
Missing (%): 0.0%
Memory size: 444.0 B

Value | Count
서울메트로 | 20
도시철도공사 | 14
서울교통공사 | 4
서울메트로9호선운영 | 1

Length

Max length: 10
Median length: 5
Mean length: 5.5897436
Min length: 5

Unique

Unique: 1
Unique (%): 2.6%

Sample

1st row: 서울메트로
2nd row: 서울메트로
3rd row: 서울메트로
4th row: 서울메트로
5th row: 서울메트로

Common Values

Value | Count | Frequency (%)
서울메트로 | 20 | 51.3%
도시철도공사 | 14 | 35.9%
서울교통공사 | 4 | 10.3%
서울메트로9호선운영 | 1 | 2.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values of 기관]

호선
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9
Distinct (%): 23.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.4102564
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 483.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1.9
Q1: 2
Median: 4
Q3: 6
95-th percentile: 8.1
Maximum: 9
Range: 8
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.2561955
Coefficient of variation (CV): 0.51157922
Kurtosis: -0.84954215
Mean: 4.4102564
Median Absolute Deviation (MAD): 2
Skewness: 0.3700827
Sum: 172
Variance: 5.0904184
Monotonicity: Increasing
[Figure: histogram with fixed-size bins (bins=9)]
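Common values
Value | Count | Frequency (%)
2 | 9 | 23.1%
5 | 7 | 17.9%
3 | 5 | 12.8%
4 | 4 | 10.3%
6 | 4 | 10.3%
7 | 4 | 10.3%
1 | 2 | 5.1%
8 | 2 | 5.1%
9 | 2 | 5.1%

Minimum values
Value | Count | Frequency (%)
1 | 2 | 5.1%
2 | 9 | 23.1%
3 | 5 | 12.8%
4 | 4 | 10.3%
5 | 7 | 17.9%
6 | 4 | 10.3%
7 | 4 | 10.3%
8 | 2 | 5.1%
9 | 2 | 5.1%

Maximum values
Value | Count | Frequency (%)
9 | 2 | 5.1%
8 | 2 | 5.1%
7 | 4 | 10.3%
6 | 4 | 10.3%
5 | 7 | 17.9%
4 | 4 | 10.3%
3 | 5 | 12.8%
2 | 9 | 23.1%
1 | 2 | 5.1%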

구간
Text

UNIQUE 

Distinct: 39
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 444.0 B

Length

Max length: 15
Median length: 11
Mean length: 7.2307692
Min length: 2

Characters and Unicode

Total characters: 282
Distinct characters: 97
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
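
As a rough illustration of that idea, the sketch below tallies Unicode general categories for the five 구간 values listed under Sample, using Python's standard `unicodedata` module. It covers only those five rows, so the counts are far smaller than the report's totals. The two-letter codes map to the category names used in this report: `Lo` is Other Letter, `Sm` is Math Symbol, `Ps` is Open Punctuation, `Pe` is Close Punctuation and `Nd` is Decimal Number.

```python
import unicodedata
from collections import Counter

# The five 구간 values shown in the Sample subsection of this variable.
sample_values = [
    "서울↔청량리",
    "동묘앞",
    "신설동(2)↔종합운동장",
    "종합운동장↔교대(2)",
    "을지입구↔성수",
]

# Count the general category of every character across the sample.
category_counts = Counter(
    unicodedata.category(ch) for value in sample_values for ch in value
)

for category, count in category_counts.most_common():
    print(category, count)
```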

Unique

Unique: 39
Unique (%): 100.0%

Sample

1st row: 서울↔청량리
2nd row: 동묘앞
3rd row: 신설동(2)↔종합운동장
4th row: 종합운동장↔교대(2)
5th row: 을지입구↔성수

Value | Count | Frequency (%)
서울↔청량리 | 1 | 2.6%
왕십리↔상일동 | 1 | 2.6%
강동↔마천 | 1 | 2.6%
까치산↔여의도 | 1 | 2.6%
여의도↔왕십리 | 1 | 2.6%
미사↔하남풍산 | 1 | 2.6%
강일↔하남검단산 | 1 | 2.6%
봉화산↔상월곡 | 1 | 2.6%
응암↔상월곡 | 1 | 2.6%
방화↔까치산 | 1 | 2.6%
Other values (29) | 29 | 74.4%

Most occurring characters

Value | Count | Frequency (%)
↔ | 37 | 13.1%
[?] | 13 | 4.6%
[?] | 8 | 2.8%
[?] | 8 | 2.8%
[?] | 8 | 2.8%
[?] | 8 | 2.8%
[?] | 7 | 2.5%
[?] | 6 | 2.1%
[?] | 5 | 1.8%
[?] | 5 | 1.8%
Other values (87) | 177 | 62.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 231 | 81.9%
Math Symbol | 37 | 13.1%
Open Punctuation | 5 | 1.8%
Close Punctuation | 5 | 1.8%
Decimal Number | 4 | 1.4%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 13 | 5.6%
[?] | 8 | 3.5%
[?] | 8 | 3.5%
[?] | 8 | 3.5%
[?] | 8 | 3.5%
[?] | 7 | 3.0%
[?] | 6 | 2.6%
[?] | 5 | 2.2%
[?] | 5 | 2.2%
[?] | 5 | 2.2%
Other values (82) | 158 | 68.4%

Decimal Number
Value | Count | Frequency (%)
2 | 3 | 75.0%
9 | 1 | 25.0%

Math Symbol
Value | Count | Frequency (%)
↔ | 37 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 5 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 5 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 231 | 81.9%
Common | 51 | 18.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 13 | 5.6%
[?] | 8 | 3.5%
[?] | 8 | 3.5%
[?] | 8 | 3.5%
[?] | 8 | 3.5%
[?] | 7 | 3.0%
[?] | 6 | 2.6%
[?] | 5 | 2.2%
[?] | 5 | 2.2%
[?] | 5 | 2.2%
Other values (82) | 158 | 68.4%

Common
Value | Count | Frequency (%)
↔ | 37 | 72.5%
( | 5 | 9.8%
) | 5 | 9.8%
2 | 3 | 5.9%
9 | 1 | 2.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 231 | 81.9%
Arrows | 37 | 13.1%
ASCII | 14 | 5.0%

Most frequent character per block

Arrows
Value | Count | Frequency (%)
↔ | 37 | 100.0%

Hangul
Value | Count | Frequency (%)
[?] | 13 | 5.6%
[?] | 8 | 3.5%
[?] | 8 | 3.5%
[?] | 8 | 3.5%
[?] | 8 | 3.5%
[?] | 7 | 3.0%
[?] | 6 | 2.6%
[?] | 5 | 2.2%
[?] | 5 | 2.2%
[?] | 5 | 2.2%
Other values (82) | 158 | 68.4%

ASCII
Value | Count | Frequency (%)
( | 5 | 35.7%
) | 5 | 35.7%
2 | 3 | 21.4%
9 | 1 | 7.1%

역개수
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 17
Distinct (%): 44.7%
Missing: 1
Missing (%): 2.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.7894737
Minimum: 1
Maximum: 28
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 483.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 7
Q3: 10.75
95-th percentile: 16.45
Maximum: 28
Range: 27
Interquartile range (IQR): 7.75

Descriptive statistics

Standard deviation: 6.0767963
Coefficient of variation (CV): 0.78012926
Kurtosis: 1.8177581
Mean: 7.7894737
Median Absolute Deviation (MAD): 4
Skewness: 1.1415564
Sum: 296
Variance: 36.927454
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=17)]
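
For a variable with missing values, the figures above appear to be computed over the non-missing observations only: the distinct percentage, the mean and the coefficient of variation all reproduce when 38 (39 minus 1 missing) is used as the denominator. The sketch below checks this from the reported numbers for 역개수 and, for comparison, 연장(km). The function name `non_missing_stats` is illustrative, not a ydata-profiling API.

```python
N_OBSERVATIONS = 39  # rows in the dataset, from the Overview

def non_missing_stats(distinct: int, missing: int, total: float, std_dev: float) -> dict:
    """Recompute summary figures using only the non-missing observations."""
    n_valid = N_OBSERVATIONS - missing
    mean = total / n_valid
    return {
        "n_valid": n_valid,
        "distinct_pct": round(100 * distinct / n_valid, 1),
        "mean": mean,
        "cv": std_dev / mean,  # coefficient of variation
    }

# Inputs copied from the report: Distinct, Missing, Sum, Standard deviation.
station_count = non_missing_stats(distinct=17, missing=1, total=296, std_dev=6.0767963)
length_km = non_missing_stats(distinct=34, missing=3, total=322.5, std_dev=6.7475445)

print(station_count)
print(length_km)
```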
Common values
Value | Count | Frequency (%)
1 | 7 | 17.9%
9 | 4 | 10.3%
5 | 3 | 7.7%
8 | 3 | 7.7%
7 | 3 | 7.7%
4 | 2 | 5.1%
13 | 2 | 5.1%
14 | 2 | 5.1%
3 | 2 | 5.1%
2 | 2 | 5.1%
Other values (7) | 8 | 20.5%

Minimum values
Value | Count | Frequency (%)
1 | 7 | 17.9%
2 | 2 | 5.1%
3 | 2 | 5.1%
4 | 2 | 5.1%
5 | 3 | 7.7%
6 | 1 | 2.6%
7 | 3 | 7.7%
8 | 3 | 7.7%
9 | 4 | 10.3%
10 | 1 | 2.6%

Maximum values
Value | Count | Frequency (%)
28 | 1 | 2.6%
19 | 1 | 2.6%
16 | 2 | 5.1%
15 | 1 | 2.6%
14 | 2 | 5.1%
13 | 2 | 5.1%
11 | 1 | 2.6%
10 | 1 | 2.6%
9 | 4 | 10.3%
8 | 3 | 7.7%

연장(km)
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 34
Distinct (%): 94.4%
Missing: 3
Missing (%): 7.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.9583333
Minimum: 1.2
Maximum: 30.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 483.0 B

Quantile statistics

Minimum: 1.2
5-th percentile: 1.375
Q1: 3.9
Median: 7.85
Q3: 13.325
95-th percentile: 19.2
Maximum: 30.9
Range: 29.7
Interquartile range (IQR): 9.425

Descriptive statistics

Standard deviation: 6.7475445
Coefficient of variation (CV): 0.75321427
Kurtosis: 1.6653815
Mean: 8.9583333
Median Absolute Deviation (MAD): 4.9
Skewness: 1.181629
Sum: 322.5
Variance: 45.529357
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=34)]
Common values
Value | Count | Frequency (%)
7.9 | 2 | 5.1%
4.6 | 2 | 5.1%
1.3 | 1 | 2.6%
7.1 | 1 | 2.6%
14.0 | 1 | 2.6%
2.9 | 1 | 2.6%
4.2 | 1 | 2.6%
30.9 | 1 | 2.6%
19.0 | 1 | 2.6%
14.4 | 1 | 2.6%
Other values (24) | 24 | 61.5%
(Missing) | 3 | 7.7%

Minimum values
Value | Count | Frequency (%)
1.2 | 1 | 2.6%
1.3 | 1 | 2.6%
1.4 | 1 | 2.6%
1.5 | 1 | 2.6%
1.9 | 1 | 2.6%
2.2 | 1 | 2.6%
2.7 | 1 | 2.6%
2.9 | 1 | 2.6%
3.0 | 1 | 2.6%
4.2 | 1 | 2.6%

Maximum values
Value | Count | Frequency (%)
30.9 | 1 | 2.6%
19.8 | 1 | 2.6%
19.0 | 1 | 2.6%
18.7 | 1 | 2.6%
18.2 | 1 | 2.6%
16.5 | 1 | 2.6%
14.4 | 1 | 2.6%
14.3 | 1 | 2.6%
14.0 | 1 | 2.6%
13.1 | 1 | 2.6%

개통일자
DateTime

Distinct: 37
Distinct (%): 94.9%
Missing: 0
Missing (%): 0.0%
Memory size: 444.0 B
Minimum: 1974-08-15 00:00:00
Maximum: 2021-03-27 00:00:00
[Figure: histogram with fixed-size bins (bins=37)]

Interactions

[Figure: 16 pairwise interaction plots]

Correlations

[Figure: correlation heatmap]
 | 연번 | 기관 | 호선 | 구간 | 역개수 | 연장(km) | 개통일자
연번 | 1.000 | 0.724 | 0.944 | 1.000 | 0.000 | 0.000 | 0.827
기관 | 0.724 | 1.000 | 0.804 | 1.000 | 0.365 | 0.000 | 0.967
호선 | 0.944 | 0.804 | 1.000 | 1.000 | 0.000 | 0.000 | 0.920
구간 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
역개수 | 0.000 | 0.365 | 0.000 | 1.000 | 1.000 | 0.972 | 0.794
연장(km) | 0.000 | 0.000 | 0.000 | 1.000 | 0.972 | 1.000 | 0.983
개통일자 | 0.827 | 0.967 | 0.920 | 1.000 | 0.794 | 0.983 | 1.000

[Figure: correlation heatmap]
 | 연번 | 호선 | 역개수 | 연장(km) | 기관
연번 | 1.000 | 0.988 | 0.152 | 0.133 | 0.510
호선 | 0.988 | 1.000 | 0.232 | 0.210 | 0.614
역개수 | 0.152 | 0.232 | 1.000 | 0.975 | 0.142
연장(km) | 0.133 | 0.210 | 0.975 | 1.000 | 0.000
기관 | 0.510 | 0.614 | 0.142 | 0.000 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

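First rows
Index | 연번 | 기관 | 호선 | 구간 | 역개수 | 연장(km) | 개통일자
0 | 1 | 서울메트로 | 1 | 서울↔청량리 | 9 | 7.8 | 1974-08-15
1 | 2 | 서울메트로 | 1 | 동묘앞 | 1 | <NA> | 2005-12-21
2 | 3 | 서울메트로 | 2 | 신설동(2)↔종합운동장 | 11 | 14.3 | 1980-10-31
3 | 4 | 서울메트로 | 2 | 종합운동장↔교대(2) | 5 | 5.5 | 1982-12-23
4 | 5 | 서울메트로 | 2 | 을지입구↔성수 | 9 | 7.9 | 1983-09-16
5 | 6 | 서울메트로 | 2 | 교대(2)↔서울대입구 | 5 | 6.7 | 1983-12-17
6 | 7 | 서울메트로 | 2 | 서울대입구↔을지입구 | 16 | 19.8 | 1984-05-22
7 | 8 | 서울메트로 | 2 | 신도림↔양천구청 | 2 | 2.7 | 1992-05-22
8 | 9 | 서울메트로 | 2 | 양천구청↔신정네거리 | 1 | 1.9 | 1996-02-29
9 | 10 | 서울메트로 | 2 | 신정네거리↔까치산 | <NA> | 1.4 | 1996-03-20

Last rows
Index | 연번 | 기관 | 호선 | 구간 | 역개수 | 연장(km) | 개통일자
29 | 30 | 도시철도공사 | 6 | 이태원↔약수 | 4 | <NA> | 2001-03-09
30 | 31 | 서울교통공사 | 6 | 봉화산↔신내 | 1 | 1.3 | 2019-12-21
31 | 32 | 도시철도공사 | 7 | 장암↔건대입구 | 19 | 19.0 | 1996-10-11
32 | 33 | 도시철도공사 | 7 | 온수↔신풍 | 8 | 9.2 | 2000-02-29
33 | 34 | 도시철도공사 | 7 | 온수↔부평구청 | 9 | 10.2 | 2012-10-27
34 | 35 | 도시철도공사 | 7 | 건대입구↔신풍 | 15 | 18.7 | 2000-08-01
35 | 36 | 도시철도공사 | 8 | 잠실↔모란 | 13 | 13.1 | 1996-11-23
36 | 37 | 도시철도공사 | 8 | 암사↔잠실 | 4 | 4.6 | 1999-07-02
37 | 38 | 서울메트로9호선운영 | 9 | 신논현↔종합운동장 | 5 | 4.5 | 2015-03-28
38 | 39 | 서울교통공사 | 9 | 종합운동장(9)↔중앙보훈병원 | 8 | 9.1 | 2018-12-01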