Overview

Dataset statistics

Number of variables: 6
Number of observations: 37
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.9 KiB
Average record size in memory: 52.6 B

Variable types

Categorical: 1
Numeric: 1
Text: 4

Dataset

Description: File download
Author: 서울교통공사
URL: https://data.seoul.go.kr/dataList/OA-13317/F/1/datasetView.do

Alerts

호선 is highly overall correlated with 회사 (High correlation)
회사 is highly overall correlated with 호선 (High correlation)
[?] has unique values (Unique)

Reproduction

Analysis started: 2023-12-11 06:13:14.485878
Analysis finished: 2023-12-11 06:13:15.010629
Duration: 0.52 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

회사
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 8.1%
Missing: 0
Missing (%): 0.0%
Memory size: 428.0 B

서울메트로: 20
도시철도공사: 14
메트로9호선: 3

Length

Max length: 6
Median length: 5
Mean length: 5.4594595
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울메트로
2nd row: 서울메트로
3rd row: 서울메트로
4th row: 서울메트로
5th row: 서울메트로

Common Values

Value | Count | Frequency (%)
서울메트로 | 20 | 54.1%
도시철도공사 | 14 | 37.8%
메트로9호선 | 3 | 8.1%
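
Each frequency is the value's count divided by the 37 observations. A minimal sketch in Python that rebuilds the table from the reported counts; the variable names are illustrative and the column is reconstructed from the counts, not loaded from the original file:

```python
from collections import Counter

# Rebuild the 회사 column from the counts reported in the table (assumption:
# only the counts matter here, not the original row order).
companies = ["서울메트로"] * 20 + ["도시철도공사"] * 14 + ["메트로9호선"] * 3

counts = Counter(companies)
total = sum(counts.values())

# (value, count, frequency in percent rounded to one decimal place)
table = [
    (value, n, round(100 * n / total, 1))
    for value, n in counts.most_common()
]

for value, n, pct in table:
    print(f"{value} | {n} | {pct}%")
```
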

Length

Histogram of lengths of the category

Common Values (Plot)


호선
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9
Distinct (%): 24.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.4594595
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 465.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1.8
Q1: 2
median: 4
Q3: 6
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 4
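
The quantile figures above are consistent with linear interpolation between sorted values, which is the default in pandas and NumPy. The sketch below assumes that method and rebuilds the 호선 column from the value counts reported for this variable; `quantile` is a hypothetical helper, not a function of the profiling library:

```python
# Value counts for 호선 as reported in this section: {line number: rows}.
counts = {1: 2, 2: 9, 3: 5, 4: 4, 5: 5, 6: 3, 7: 4, 8: 2, 9: 3}
values = sorted(v for v, n in counts.items() for _ in range(n))


def quantile(data, q):
    """Quantile of pre-sorted data using linear interpolation."""
    pos = q * (len(data) - 1)          # fractional index into the sorted data
    lower = int(pos)
    upper = min(lower + 1, len(data) - 1)
    frac = pos - lower
    return data[lower] + frac * (data[upper] - data[lower])


p05 = quantile(values, 0.05)
q1 = quantile(values, 0.25)
q3 = quantile(values, 0.75)
p95 = quantile(values, 0.95)
iqr = q3 - q1

print(p05, q1, q3, p95, iqr)
```

The 5th percentile falls at fractional index 1.8, between a 1 and a 2 in the sorted data, which is why it is not a whole number.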

Descriptive statistics

Standard deviation: 2.4220583
Coefficient of variation (CV): 0.54312822
Kurtosis: -0.96699417
Mean: 4.4594595
Median Absolute Deviation (MAD): 2
Skewness: 0.44291989
Sum: 165
Variance: 5.8663664
Monotonicity: Increasing
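
Several of these figures can be checked with the Python standard library. This is a sketch under the assumption that the report uses the sample (n - 1) variance, which matches the reported value; the column is rebuilt from the value counts listed for 호선, and skewness and kurtosis are left out:

```python
import statistics

# Value counts for 호선 as reported in this section: {line number: rows}.
counts = {1: 2, 2: 9, 3: 5, 4: 4, 5: 5, 6: 3, 7: 4, 8: 2, 9: 3}
values = [v for v, n in counts.items() for _ in range(n)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, divides by n - 1
std = statistics.stdev(values)           # square root of the sample variance
cv = std / mean                          # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median distance from the median.
mad = statistics.median([abs(v - median) for v in values])

print(total, mean, variance, std, cv, median, mad)
```
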
Histogram with fixed size bins (bins=9)

Common Values

Value | Count | Frequency (%)
2 | 9 | 24.3%
3 | 5 | 13.5%
5 | 5 | 13.5%
4 | 4 | 10.8%
7 | 4 | 10.8%
6 | 3 | 8.1%
9 | 3 | 8.1%
1 | 2 | 5.4%
8 | 2 | 5.4%

Values, ascending

Value | Count | Frequency (%)
1 | 2 | 5.4%
2 | 9 | 24.3%
3 | 5 | 13.5%
4 | 4 | 10.8%
5 | 5 | 13.5%
6 | 3 | 8.1%
7 | 4 | 10.8%
8 | 2 | 5.4%
9 | 3 | 8.1%

Values, descending

Value | Count | Frequency (%)
9 | 3 | 8.1%
8 | 2 | 5.4%
7 | 4 | 10.8%
6 | 3 | 8.1%
5 | 5 | 13.5%
4 | 4 | 10.8%
3 | 5 | 13.5%
2 | 9 | 24.3%
1 | 2 | 5.4%


Text

UNIQUE 

Distinct: 37
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 428.0 B

Length

Max length: 12
Median length: 10
Mean length: 6.9459459
Min length: 2

Characters and Unicode

Total characters: 257
Distinct characters: 89
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
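
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It runs only on the five sample values shown for this variable, so the totals are smaller than the report's full-column figures. The two-letter codes map to the report's labels: `Lo` is Other Letter, `Sm` is Math Symbol, `Ps` and `Pe` are Open and Close Punctuation, and `Nd` is Decimal Number.

```python
import unicodedata
from collections import Counter

# The five sample values listed for this variable (not the full column).
samples = [
    "서울↔청량리",
    "동묘앞",
    "신설동(2)↔종합운동장",
    "종합운동장↔교대(2)",
    "을지입구↔성수",
]


def category_counts(strings):
    """Count Unicode general categories over all characters in the strings."""
    counts = Counter()
    for s in strings:
        # NFC keeps each Hangul syllable as a single code point.
        for ch in unicodedata.normalize("NFC", s):
            counts[unicodedata.category(ch)] += 1
    return counts


cats = category_counts(samples)
for code, n in cats.most_common():
    print(code, n)
```
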

Unique

Unique: 37
Unique (%): 100.0%

Sample

1st row: 서울↔청량리
2nd row: 동묘앞
3rd row: 신설동(2)↔종합운동장
4th row: 종합운동장↔교대(2)
5th row: 을지입구↔성수

Value | Count | Frequency (%)
서울↔청량리 | 1 | 2.7%
사당↔남태령(시계 | 1 | 2.7%
방화↔까치산 | 1 | 2.7%
강동↔마천 | 1 | 2.7%
까치산↔여의도 | 1 | 2.7%
여의도↔왕십리 | 1 | 2.7%
봉화산↔상월곡 | 1 | 2.7%
응암↔상월곡 | 1 | 2.7%
이태원↔약수 | 1 | 2.7%
장암↔건대입구 | 1 | 2.7%
Other values (27) | 27 | 73.0%

Most occurring characters

Value | Count | Frequency (%)
↔ | 34 | 13.2%
[?] | 13 | 5.1%
[?] | 8 | 3.1%
[?] | 8 | 3.1%
[?] | 8 | 3.1%
[?] | 7 | 2.7%
[?] | 6 | 2.3%
[?] | 5 | 1.9%
[?] | 5 | 1.9%
[?] | 5 | 1.9%
Other values (79) | 158 | 61.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 212 | 82.5%
Math Symbol | 34 | 13.2%
Close Punctuation | 4 | 1.6%
Open Punctuation | 4 | 1.6%
Decimal Number | 3 | 1.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 13 | 6.1%
[?] | 8 | 3.8%
[?] | 8 | 3.8%
[?] | 8 | 3.8%
[?] | 7 | 3.3%
[?] | 6 | 2.8%
[?] | 5 | 2.4%
[?] | 5 | 2.4%
[?] | 5 | 2.4%
[?] | 4 | 1.9%
Other values (75) | 143 | 67.5%

Math Symbol
Value | Count | Frequency (%)
↔ | 34 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 4 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 4 | 100.0%

Decimal Number
Value | Count | Frequency (%)
2 | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 212 | 82.5%
Common | 45 | 17.5%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 13 | 6.1%
[?] | 8 | 3.8%
[?] | 8 | 3.8%
[?] | 8 | 3.8%
[?] | 7 | 3.3%
[?] | 6 | 2.8%
[?] | 5 | 2.4%
[?] | 5 | 2.4%
[?] | 5 | 2.4%
[?] | 4 | 1.9%
Other values (75) | 143 | 67.5%

Common
Value | Count | Frequency (%)
↔ | 34 | 75.6%
) | 4 | 8.9%
( | 4 | 8.9%
2 | 3 | 6.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 212 | 82.5%
Arrows | 34 | 13.2%
ASCII | 11 | 4.3%

Most frequent character per block

Arrows
Value | Count | Frequency (%)
↔ | 34 | 100.0%

Hangul
Value | Count | Frequency (%)
[?] | 13 | 6.1%
[?] | 8 | 3.8%
[?] | 8 | 3.8%
[?] | 8 | 3.8%
[?] | 7 | 3.3%
[?] | 6 | 2.8%
[?] | 5 | 2.4%
[?] | 5 | 2.4%
[?] | 5 | 2.4%
[?] | 4 | 1.9%
Other values (75) | 143 | 67.5%

ASCII
Value | Count | Frequency (%)
) | 4 | 36.4%
( | 4 | 36.4%
2 | 3 | 27.3%

역 수
Text

Distinct: 19
Distinct (%): 51.4%
Missing: 0
Missing (%): 0.0%
Memory size: 428.0 B

Length

Max length: 2
Median length: 1
Mean length: 1.3243243
Min length: 1

Characters and Unicode

Total characters: 49
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10
Unique (%): 27.0%

Sample

1st row: 9
2nd row: 1
3rd row: 11
4th row: 5
5th row: 9

Value | Count | Frequency (%)
1 | 7 | 18.9%
9 | 4 | 10.8%
5 | 3 | 8.1%
7 | 3 | 8.1%
16 | 2 | 5.4%
4 | 2 | 5.4%
14 | 2 | 5.4%
8 | 2 | 5.4%
13 | 2 | 5.4%
10 | 1 | 2.7%
Other values (9) | 9 | 24.3%

Most occurring characters

Value | Count | Frequency (%)
1 | 18 | 36.7%
9 | 5 | 10.2%
4 | 5 | 10.2%
5 | 4 | 8.2%
7 | 3 | 6.1%
6 | 3 | 6.1%
8 | 3 | 6.1%
3 | 3 | 6.1%
2 | 3 | 6.1%
- | 1 | 2.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 48 | 98.0%
Dash Punctuation | 1 | 2.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 18 | 37.5%
9 | 5 | 10.4%
4 | 5 | 10.4%
5 | 4 | 8.3%
7 | 3 | 6.2%
6 | 3 | 6.2%
8 | 3 | 6.2%
3 | 3 | 6.2%
2 | 3 | 6.2%
0 | 1 | 2.1%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 49 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 18 | 36.7%
9 | 5 | 10.2%
4 | 5 | 10.2%
5 | 4 | 8.2%
7 | 3 | 6.1%
6 | 3 | 6.1%
8 | 3 | 6.1%
3 | 3 | 6.1%
2 | 3 | 6.1%
- | 1 | 2.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 49 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 18 | 36.7%
9 | 5 | 10.2%
4 | 5 | 10.2%
5 | 4 | 8.2%
7 | 3 | 6.1%
6 | 3 | 6.1%
8 | 3 | 6.1%
3 | 3 | 6.1%
2 | 3 | 6.1%
- | 1 | 2.0%

연 장(km)
Text

Distinct: 33
Distinct (%): 89.2%
Missing: 0
Missing (%): 0.0%
Memory size: 428.0 B

Length

Max length: 4
Median length: 3
Mean length: 2.8648649
Min length: 1

Characters and Unicode

Total characters: 106
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 31
Unique (%): 83.8%

Sample

1st row: 7.8
2nd row: -
3rd row: 14.3
4th row: 5.5
5th row: 7.9

Value | Count | Frequency (%)
- | 4 | 10.8%
7.9 | 2 | 5.4%
19 | 1 | 2.7%
14.4 | 1 | 2.7%
8.9 | 1 | 2.7%
7.1 | 1 | 2.7%
14 | 1 | 2.7%
4.2 | 1 | 2.7%
9.2 | 1 | 2.7%
7.8 | 1 | 2.7%
Other values (23) | 23 | 62.2%

Most occurring characters

Value | Count | Frequency (%)
. | 28 | 26.4%
1 | 18 | 17.0%
7 | 10 | 9.4%
2 | 9 | 8.5%
9 | 8 | 7.5%
4 | 8 | 7.5%
8 | 7 | 6.6%
5 | 5 | 4.7%
- | 4 | 3.8%
3 | 4 | 3.8%
Other values (2) | 5 | 4.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 74 | 69.8%
Other Punctuation | 28 | 26.4%
Dash Punctuation | 4 | 3.8%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 18 | 24.3%
7 | 10 | 13.5%
2 | 9 | 12.2%
9 | 8 | 10.8%
4 | 8 | 10.8%
8 | 7 | 9.5%
5 | 5 | 6.8%
3 | 4 | 5.4%
6 | 3 | 4.1%
0 | 2 | 2.7%

Other Punctuation
Value | Count | Frequency (%)
. | 28 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 4 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 106 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
. | 28 | 26.4%
1 | 18 | 17.0%
7 | 10 | 9.4%
2 | 9 | 8.5%
9 | 8 | 7.5%
4 | 8 | 7.5%
8 | 7 | 6.6%
5 | 5 | 4.7%
- | 4 | 3.8%
3 | 4 | 3.8%
Other values (2) | 5 | 4.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 106 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
. | 28 | 26.4%
1 | 18 | 17.0%
7 | 10 | 9.4%
2 | 9 | 8.5%
9 | 8 | 7.5%
4 | 8 | 7.5%
8 | 7 | 6.6%
5 | 5 | 4.7%
- | 4 | 3.8%
3 | 4 | 3.8%
Other values (2) | 5 | 4.7%

개통일
Text

Distinct: 35
Distinct (%): 94.6%
Missing: 0
Missing (%): 0.0%
Memory size: 428.0 B

Length

Max length: 9
Median length: 8
Mean length: 8.3513514
Min length: 7

Characters and Unicode

Total characters: 309
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 33
Unique (%): 89.2%

Sample

1st row: '74.8.15
2nd row: '05.12.21
3rd row: '80.10.31
4th row: '82.12.23
5th row: '83.9.16

Value | Count | Frequency (%)
96.3.20 | 2 | 5.4%
85.10.18 | 2 | 5.4%
95.11.15 | 1 | 2.7%
96.8.12 | 1 | 2.7%
96.12.30 | 1 | 2.7%
00.8.7 | 1 | 2.7%
00.12.15 | 1 | 2.7%
01.3.9 | 1 | 2.7%
96.10.11 | 1 | 2.7%
96.3.30 | 1 | 2.7%
Other values (25) | 25 | 67.6%

Most occurring characters

Value | Count | Frequency (%)
. | 74 | 23.9%
1 | 39 | 12.6%
' | 37 | 12.0%
0 | 33 | 10.7%
2 | 32 | 10.4%
9 | 21 | 6.8%
8 | 17 | 5.5%
3 | 16 | 5.2%
5 | 14 | 4.5%
6 | 9 | 2.9%
Other values (2) | 17 | 5.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 198 | 64.1%
Other Punctuation | 111 | 35.9%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 39 | 19.7%
0 | 33 | 16.7%
2 | 32 | 16.2%
9 | 21 | 10.6%
8 | 17 | 8.6%
3 | 16 | 8.1%
5 | 14 | 7.1%
6 | 9 | 4.5%
4 | 9 | 4.5%
7 | 8 | 4.0%

Other Punctuation
Value | Count | Frequency (%)
. | 74 | 66.7%
' | 37 | 33.3%

Most occurring scripts

Value | Count | Frequency (%)
Common | 309 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
. | 74 | 23.9%
1 | 39 | 12.6%
' | 37 | 12.0%
0 | 33 | 10.7%
2 | 32 | 10.4%
9 | 21 | 6.8%
8 | 17 | 5.5%
3 | 16 | 5.2%
5 | 14 | 4.5%
6 | 9 | 2.9%
Other values (2) | 17 | 5.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 309 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
. | 74 | 23.9%
1 | 39 | 12.6%
' | 37 | 12.0%
0 | 33 | 10.7%
2 | 32 | 10.4%
9 | 21 | 6.8%
8 | 17 | 5.5%
3 | 16 | 5.2%
5 | 14 | 4.5%
6 | 9 | 2.9%
Other values (2) | 17 | 5.5%

Interactions


Correlations

Variable | 회사 | 호선 | [?] | 역 수 | 연 장(km) | 개통일
회사 | 1.000 | 1.000 | 1.000 | 0.624 | 0.000 | 0.928
호선 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.915
[?] | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
역 수 | 0.624 | 0.000 | 1.000 | 1.000 | 0.983 | 0.898
연 장(km) | 0.000 | 0.000 | 1.000 | 0.983 | 1.000 | 0.933
개통일 | 0.928 | 0.915 | 1.000 | 0.898 | 0.933 | 1.000

Variable | 호선 | 회사
호선 | 1.000 | 0.907
회사 | 0.907 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

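First rows

Index | 회사 | 호선 | [?] | 역 수 | 연 장(km) | 개통일
0 | 서울메트로 | 1 | 서울↔청량리 | 9 | 7.8 | '74.8.15
1 | 서울메트로 | 1 | 동묘앞 | 1 | - | '05.12.21
2 | 서울메트로 | 2 | 신설동(2)↔종합운동장 | 11 | 14.3 | '80.10.31
3 | 서울메트로 | 2 | 종합운동장↔교대(2) | 5 | 5.5 | '82.12.23
4 | 서울메트로 | 2 | 을지입구↔성수 | 9 | 7.9 | '83.9.16
5 | 서울메트로 | 2 | 교대(2)↔서울대입구 | 5 | 6.7 | '83.12.17
6 | 서울메트로 | 2 | 서울대입구↔을지입구 | 16 | 19.8 | '84.5.22
7 | 서울메트로 | 2 | 신도림↔양천구청 | 2 | 2.7 | '92.5.22
8 | 서울메트로 | 2 | 양천구청↔신정네거리 | 1 | 1.9 | '96.2.29
9 | 서울메트로 | 2 | 신정네거리↔까치산 | - | 1.4 | '96.3.20

Last rows

Index | 회사 | 호선 | [?] | 역 수 | 연 장(km) | 개통일
27 | 도시철도공사 | 6 | 이태원↔약수 | 4 | - | '01.3.9
28 | 도시철도공사 | 7 | 장암↔건대입구 | 19 | 19 | '96.10.11
29 | 도시철도공사 | 7 | 온수↔신풍 | 8 | 9.2 | '00.2.29
30 | 도시철도공사 | 7 | 온수↔부평구청 | 9 | 10.2 | '12.10.27
31 | 도시철도공사 | 7 | 건대입구↔신풍 | 15 | 18.7 | '00.8.1
32 | 도시철도공사 | 8 | 잠실↔모란 | 13 | 13.1 | '96.11.23
33 | 도시철도공사 | 8 | 암사↔잠실 | 4 | 4.6 | '99.07.02
34 | 메트로9호선 | 9 | 개화↔신논현 | 24 | 27 | '09.07.24
35 | 메트로9호선 | 9 | 마곡나루역 | 1 | - | '14.05.24
36 | 메트로9호선 | 9 | 신논현↔종합운동장 | 5 | 4.7 | '15.3.28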
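
The 개통일 values in the sample rows are stored as text such as '74.8.15, which is why the report types the column as Text. A minimal sketch of one way to convert them to dates, assuming the format is a two-digit year, month and day separated by dots. `parse_opening_date` and its `pivot` parameter are hypothetical, and the rule that years from 70 upward belong to the 1900s is an assumption that should be checked against the source data:

```python
from datetime import date


def parse_opening_date(text, pivot=70):
    """Parse strings like "'74.8.15" into a date.

    Assumption: two-digit years >= pivot are 19xx, all others are 20xx.
    """
    yy, mm, dd = (int(part) for part in text.strip().lstrip("'").split("."))
    century = 1900 if yy >= pivot else 2000
    return date(century + yy, mm, dd)


# Values taken from the sample rows of this report.
for raw in ["'74.8.15", "'05.12.21", "'99.07.02"]:
    print(raw, "->", parse_opening_date(raw).isoformat())
```
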