Overview

Dataset statistics

Number of variables: 6
Number of observations: 37
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.9 KiB
Average record size in memory: 52.6 B

Variable types

Categorical: 2
Numeric: 1
Text: 3

Dataset

Description: File download
Author: 서울교통공사
URL: https://data.seoul.go.kr/dataList/OA-13317/F/1/datasetView.do

Alerts

호선 is highly overall correlated with 회사 (High correlation)
회사 is highly overall correlated with 호선 (High correlation)
(unnamed column) has unique values (Unique)

Reproduction

Analysis started: 2023-12-11 06:13:10.352862
Analysis finished: 2023-12-11 06:13:10.969255
Duration: 0.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
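
A report of this kind can be regenerated with ydata-profiling. The following is a minimal sketch, not the script that produced this report: the function name, the CSV path, the `cp949` encoding, and the report title are assumptions, while the dataset metadata is copied from the Dataset section above.

```python
def build_report(csv_path: str, output_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported inside the function so the definition itself does not
    # require the third-party packages to be installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the downloaded file is CP949-encoded, which is common for
    # Korean public data. Use encoding="utf-8" if that is not the case.
    df = pd.read_csv(csv_path, encoding="cp949")

    report = ProfileReport(
        df,
        title="Seoul subway sections",  # hypothetical title
        dataset={
            "description": "파일 다운로드",
            "author": "서울교통공사",
            "url": "https://data.seoul.go.kr/dataList/OA-13317/F/1/datasetView.do",
        },
    )
    report.to_file(output_path)
```

Calling `build_report("sections.csv")` with a hypothetical local file name would write `report.html` next to the script.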

Variables

회사 (company)
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 10.8%
Missing: 0
Missing (%): 0.0%
Memory size: 428.0 B
서울메트로: 20
도시철도공사: 14
서울교통공사: 2
서울메트로9호선운영: 1

Length

Max length: 10
Median length: 5
Mean length: 5.5675676
Min length: 5
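
The length statistics can be checked by hand from the four category counts reported for this variable. A small sketch, assuming the lengths are counted in Unicode code points (which is what Python's `len` does for these precomposed Hangul strings):

```python
from statistics import mean, median

# Category counts for 회사 as reported in this section.
counts = {
    "서울메트로": 20,
    "도시철도공사": 14,
    "서울교통공사": 2,
    "서울메트로9호선운영": 1,
}

# One length per observation (37 in total).
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

length_stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": round(mean(lengths), 7),
    "min": min(lengths),
}
print(length_stats)
```

The mean works out to 206 / 37, which agrees with the reported 5.5675676.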

Unique

Unique: 1
Unique (%): 2.7%

Sample

1st row: 서울메트로
2nd row: 서울메트로
3rd row: 서울메트로
4th row: 서울메트로
5th row: 서울메트로

Common Values

Value | Count | Frequency (%)
서울메트로 | 20 | 54.1%
도시철도공사 | 14 | 37.8%
서울교통공사 | 2 | 5.4%
서울메트로9호선운영 | 1 | 2.7%
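
The frequency column is each count divided by the 37 observations. The sketch below rebuilds the column from the counts and also derives the Distinct and Unique figures; the variable names are illustrative only.

```python
from collections import Counter

# Rebuild the 37 observations of 회사 from the reported counts.
observations = (
    ["서울메트로"] * 20
    + ["도시철도공사"] * 14
    + ["서울교통공사"] * 2
    + ["서울메트로9호선운영"] * 1
)

counts = Counter(observations)
total = sum(counts.values())

# Frequency (%) = count / number of observations, to one decimal place.
frequencies = {value: round(100 * n / total, 1) for value, n in counts.most_common()}

distinct = len(counts)                              # number of different values
distinct_pct = round(100 * distinct / total, 1)
unique = sum(1 for n in counts.values() if n == 1)  # values that occur exactly once
unique_pct = round(100 * unique / total, 1)
```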

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

호선 (line number)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9
Distinct (%): 24.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.3783784
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 465.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1.8
Q1: 2
Median: 4
Q3: 6
95-th percentile: 8.2
Maximum: 9
Range: 8
Interquartile range (IQR): 4
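
The quantile figures can be reproduced from the value counts of 호선 given in this report. The sketch below assumes linear interpolation between order statistics (the pandas default); that the report uses this method is an inference from the fact that the numbers agree.

```python
from statistics import median

# Values of 호선 rebuilt from the value counts in this report.
value_counts = {1: 2, 2: 9, 3: 5, 4: 4, 5: 5, 6: 4, 7: 4, 8: 2, 9: 2}
values = sorted(v for v, n in value_counts.items() for _ in range(n))


def quantile(sorted_values, q):
    """Quantile with linear interpolation between neighbouring order statistics."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


q05, q1, q3, q95 = (quantile(values, q) for q in (0.05, 0.25, 0.75, 0.95))
summary = {
    "min": values[0],
    "5%": round(q05, 2),
    "Q1": q1,
    "median": median(values),
    "Q3": q3,
    "95%": round(q95, 2),
    "max": values[-1],
    "range": values[-1] - values[0],
    "IQR": q3 - q1,
}
```

For example, the 5-th percentile sits at position 0.05 × 36 = 1.8 in the sorted list, between a 1 and a 2, which gives 1.8.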

Descriptive statistics

Standard deviation: 2.3136233
Coefficient of variation (CV): 0.52842014
Kurtosis: -0.92997913
Mean: 4.3783784
Median Absolute Deviation (MAD): 2
Skewness: 0.40605877
Sum: 162
Variance: 5.3528529
Monotonicity: Increasing
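
Most of these figures follow directly from the value counts of 호선. A short sketch using only the standard library; the reported variance matches the sample variance (divisor n - 1), which is what `statistics.variance` computes. Skewness and kurtosis are left out because their exact estimator is not stated in the report.

```python
from statistics import mean, median, stdev, variance

# Values of 호선 rebuilt from the value counts in this report.
value_counts = {1: 2, 2: 9, 3: 5, 4: 4, 5: 5, 6: 4, 7: 4, 8: 2, 9: 2}
values = [v for v, n in value_counts.items() for _ in range(n)]

total = sum(values)                          # Sum
avg = mean(values)                           # Mean = 162 / 37
var = variance(values)                       # sample variance, divides by n - 1
std = stdev(values)                          # square root of the sample variance
cv = std / avg                               # coefficient of variation
med = median(values)
mad = median(abs(v - med) for v in values)   # median absolute deviation
```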
[Figure: Histogram with fixed size bins (bins=9)]
Value | Count | Frequency (%)
2 | 9 | 24.3%
3 | 5 | 13.5%
5 | 5 | 13.5%
4 | 4 | 10.8%
6 | 4 | 10.8%
7 | 4 | 10.8%
1 | 2 | 5.4%
8 | 2 | 5.4%
9 | 2 | 5.4%


(unnamed column)
Text

UNIQUE

Distinct: 37
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 428.0 B

Length

Max length: 15
Median length: 11
Mean length: 7.2162162
Min length: 2

Characters and Unicode

Total characters: 267
Distinct characters: 93
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
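
As an illustration of these character properties, the sketch below looks up the general category of each character in one value from this variable's Sample list, using Python's standard `unicodedata` module. The mapping to long category names is written out by hand for the five categories that occur here.

```python
import unicodedata
from collections import Counter

# One value taken from the Sample list of this variable.
text = "신설동(2)↔종합운동장"

# General category of every code point, e.g. "Lo" = Other Letter.
categories = Counter(unicodedata.category(ch) for ch in text)

# Long names as they appear in the report tables.
names = {
    "Lo": "Other Letter",
    "Sm": "Math Symbol",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Nd": "Decimal Number",
}
for code, count in categories.most_common():
    print(names[code], count)
```

The eight Hangul syllables fall under Other Letter, the arrow under Math Symbol, and the parentheses and digit under the two punctuation categories and Decimal Number.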

Unique

Unique: 37
Unique (%): 100.0%

Sample

1st row: 서울↔청량리
2nd row: 동묘앞
3rd row: 신설동(2)↔종합운동장
4th row: 종합운동장↔교대(2)
5th row: 을지입구↔성수
Value | Count | Frequency (%)
서울↔청량리 | 1 | 2.7%
사당↔남태령(시계 | 1 | 2.7%
방화↔까치산 | 1 | 2.7%
강동↔마천 | 1 | 2.7%
까치산↔여의도 | 1 | 2.7%
여의도↔왕십리 | 1 | 2.7%
봉화산↔상월곡 | 1 | 2.7%
응암↔상월곡 | 1 | 2.7%
이태원↔약수 | 1 | 2.7%
봉화산↔신내 | 1 | 2.7%
Other values (27) | 27 | 73.0%

Most occurring characters

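Value | Count | Frequency (%)
↔ | 35 | 13.1%
(not shown) | 13 | 4.9%
(not shown) | 8 | 3.0%
(not shown) | 8 | 3.0%
(not shown) | 8 | 3.0%
(not shown) | 8 | 3.0%
(not shown) | 6 | 2.2%
(not shown) | 5 | 1.9%
(not shown) | 5 | 1.9%
(not shown) | 5 | 1.9%
Other values (83) | 166 | 62.2%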

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 218 | 81.6%
Math Symbol | 35 | 13.1%
Close Punctuation | 5 | 1.9%
Open Punctuation | 5 | 1.9%
Decimal Number | 4 | 1.5%

Most frequent character per category

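Other Letter
Value | Count | Frequency (%)
(not shown) | 13 | 6.0%
(not shown) | 8 | 3.7%
(not shown) | 8 | 3.7%
(not shown) | 8 | 3.7%
(not shown) | 8 | 3.7%
(not shown) | 6 | 2.8%
(not shown) | 5 | 2.3%
(not shown) | 5 | 2.3%
(not shown) | 5 | 2.3%
(not shown) | 5 | 2.3%
Other values (78) | 147 | 67.4%

Decimal Number
Value | Count | Frequency (%)
2 | 3 | 75.0%
9 | 1 | 25.0%

Math Symbol
Value | Count | Frequency (%)
↔ | 35 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 5 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 5 | 100.0%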

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 218 | 81.6%
Common | 49 | 18.4%

Most frequent character per script

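Hangul
Value | Count | Frequency (%)
(not shown) | 13 | 6.0%
(not shown) | 8 | 3.7%
(not shown) | 8 | 3.7%
(not shown) | 8 | 3.7%
(not shown) | 8 | 3.7%
(not shown) | 6 | 2.8%
(not shown) | 5 | 2.3%
(not shown) | 5 | 2.3%
(not shown) | 5 | 2.3%
(not shown) | 5 | 2.3%
Other values (78) | 147 | 67.4%

Common
Value | Count | Frequency (%)
↔ | 35 | 71.4%
) | 5 | 10.2%
( | 5 | 10.2%
2 | 3 | 6.1%
9 | 1 | 2.0%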

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 218 | 81.6%
Arrows | 35 | 13.1%
ASCII | 14 | 5.2%
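
Block membership is a plain code point range check. Python's standard library has no block lookup, so the sketch below hard-codes the ranges of the three blocks that occur in this variable; the short labels follow the report's naming, and `block_of` is an illustrative helper, not part of any library.

```python
from collections import Counter

# Code point ranges from the Unicode block definitions.
BLOCKS = [
    ("ASCII", 0x0000, 0x007F),    # Basic Latin
    ("Arrows", 0x2190, 0x21FF),
    ("Hangul", 0xAC00, 0xD7AF),   # Hangul Syllables
]


def block_of(ch: str) -> str:
    """Return the block label for a single character, or "Other"."""
    cp = ord(ch)
    for name, start, end in BLOCKS:
        if start <= cp <= end:
            return name
    return "Other"


# One value taken from the sample rows of this report.
block_counts = Counter(block_of(ch) for ch in "종합운동장(9)↔중앙보훈병원")
print(block_counts)
```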

Most frequent character per block

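Arrows
Value | Count | Frequency (%)
↔ | 35 | 100.0%

Hangul
Value | Count | Frequency (%)
(not shown) | 13 | 6.0%
(not shown) | 8 | 3.7%
(not shown) | 8 | 3.7%
(not shown) | 8 | 3.7%
(not shown) | 8 | 3.7%
(not shown) | 6 | 2.8%
(not shown) | 5 | 2.3%
(not shown) | 5 | 2.3%
(not shown) | 5 | 2.3%
(not shown) | 5 | 2.3%
Other values (78) | 147 | 67.4%

ASCII
Value | Count | Frequency (%)
) | 5 | 35.7%
( | 5 | 35.7%
2 | 3 | 21.4%
9 | 1 | 7.1%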

역 수 (number of stations)
Categorical

Distinct: 18
Distinct (%): 48.6%
Missing: 0
Missing (%): 0.0%
Memory size: 428.0 B
1: 7
9: 4
5: 3
7: 3
8: 3
Other values (13): 17

Length

Max length: 2
Median length: 1
Mean length: 1.2972973
Min length: 1

Unique

Unique: 9
Unique (%): 24.3%

Sample

1st row: 9
2nd row: 1
3rd row: 11
4th row: 5
5th row: 9

Common Values

Value | Count | Frequency (%)
1 | 7 | 18.9%
9 | 4 | 10.8%
5 | 3 | 8.1%
7 | 3 | 8.1%
8 | 3 | 8.1%
13 | 2 | 5.4%
16 | 2 | 5.4%
4 | 2 | 5.4%
14 | 2 | 5.4%
3 | 1 | 2.7%
Other values (8) | 8 | 21.6%

Length

[Figure: Histogram of lengths of the category]

연 장(km) (length in km)
Text

Distinct: 33
Distinct (%): 89.2%
Missing: 0
Missing (%): 0.0%
Memory size: 428.0 B

Length

Max length: 4
Median length: 3
Mean length: 2.9459459
Min length: 1

Characters and Unicode

Total characters: 109
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 30
Unique (%): 81.1%

Sample

1st row: 7.8
2nd row: -
3rd row: 14.3
4th row: 5.5
5th row: 7.9
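
The "-" entry explains why this variable is profiled as Text rather than as a number, and why such rows are not counted as missing. A minimal sketch of converting the column to numbers, under the assumption that "-" means no value was recorded; `parse_km` is a hypothetical helper name.

```python
from typing import Optional


def parse_km(raw: str) -> Optional[float]:
    """Convert a 연 장(km) cell to float; "-" is treated as no value (assumption)."""
    cleaned = raw.strip()
    if cleaned in ("", "-"):
        return None
    return float(cleaned)


# The five sample values listed above.
sample = ["7.8", "-", "14.3", "5.5", "7.9"]
parsed = [parse_km(cell) for cell in sample]
print(parsed)
```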
Value | Count | Frequency (%)
- | 3 | 8.1%
8.9 | 2 | 5.4%
7.9 | 2 | 5.4%
1.3 | 1 | 2.7%
14.4 | 1 | 2.7%
7.1 | 1 | 2.7%
14 | 1 | 2.7%
4.2 | 1 | 2.7%
16.5 | 1 | 2.7%
7.8 | 1 | 2.7%
Other values (23) | 23 | 62.2%

Most occurring characters

Value | Count | Frequency (%)
. | 30 | 27.5%
1 | 19 | 17.4%
7 | 9 | 8.3%
9 | 9 | 8.3%
8 | 8 | 7.3%
4 | 8 | 7.3%
2 | 8 | 7.3%
3 | 5 | 4.6%
5 | 5 | 4.6%
- | 3 | 2.8%
Other values (2) | 5 | 4.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 76 | 69.7%
Other Punctuation | 30 | 27.5%
Dash Punctuation | 3 | 2.8%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 19 | 25.0%
7 | 9 | 11.8%
9 | 9 | 11.8%
8 | 8 | 10.5%
4 | 8 | 10.5%
2 | 8 | 10.5%
3 | 5 | 6.6%
5 | 5 | 6.6%
6 | 3 | 3.9%
0 | 2 | 2.6%

Other Punctuation
Value | Count | Frequency (%)
. | 30 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 109 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
. | 30 | 27.5%
1 | 19 | 17.4%
7 | 9 | 8.3%
9 | 9 | 8.3%
8 | 8 | 7.3%
4 | 8 | 7.3%
2 | 8 | 7.3%
3 | 5 | 4.6%
5 | 5 | 4.6%
- | 3 | 2.8%
Other values (2) | 5 | 4.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 109 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
. | 30 | 27.5%
1 | 19 | 17.4%
7 | 9 | 8.3%
9 | 9 | 8.3%
8 | 8 | 7.3%
4 | 8 | 7.3%
2 | 8 | 7.3%
3 | 5 | 4.6%
5 | 5 | 4.6%
- | 3 | 2.8%
Other values (2) | 5 | 4.6%

개통일 (opening date)
Text

Distinct: 35
Distinct (%): 94.6%
Missing: 0
Missing (%): 0.0%
Memory size: 428.0 B

Length

Max length: 10
Median length: 9
Mean length: 8.3783784
Min length: 7

Characters and Unicode

Total characters: 310
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 33
Unique (%): 89.2%

Sample

1st row: '74.8.15
2nd row: '05.12.21
3rd row: '80.10.31
4th row: '82.12.23
5th row: '83.9.16
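
These values are stored as text in the form apostrophe, two-digit year, month, day. A minimal sketch of turning them into real dates; the helper name and the pivot rule for the century (two-digit years of 70 or more are read as 19xx, the rest as 20xx) are assumptions, chosen so that '74 becomes 1974 and '05 becomes 2005.

```python
from datetime import date


def parse_opening_date(raw: str, pivot: int = 70) -> date:
    """Parse values such as "'74.8.15" into a date (illustrative sketch)."""
    year_part, month_part, day_part = raw.strip().lstrip("'").split(".")
    year = int(year_part)
    # Assumption: two-digit years >= pivot belong to the 1900s, others to the 2000s.
    year += 1900 if year >= pivot else 2000
    # int() tolerates surrounding spaces, which a few values contain.
    return date(year, int(month_part), int(day_part))


sample = ["'74.8.15", "'05.12.21", "'80.10.31", "'82.12.23", "'83.9.16"]
parsed = [parse_opening_date(value).isoformat() for value in sample]
print(parsed)
```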
Value | Count | Frequency (%)
96.3.20 | 2 | 5.4%
85.10.18 | 2 | 5.4%
95.11.15 | 1 | 2.7%
96.8.12 | 1 | 2.7%
96.12.30 | 1 | 2.7%
00.8.7 | 1 | 2.7%
00.12.15 | 1 | 2.7%
01.3.9 | 1 | 2.7%
19.12.21 | 1 | 2.7%
96.3.30 | 1 | 2.7%
Other values (25) | 25 | 67.6%

Most occurring characters

Value | Count | Frequency (%)
. | 74 | 23.9%
1 | 44 | 14.2%
' | 37 | 11.9%
2 | 33 | 10.6%
0 | 30 | 9.7%
9 | 21 | 6.8%
8 | 18 | 5.8%
3 | 16 | 5.2%
5 | 13 | 4.2%
6 | 9 | 2.9%
Other values (3) | 15 | 4.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 197 | 63.5%
Other Punctuation | 111 | 35.8%
Space Separator | 2 | 0.6%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 44 | 22.3%
2 | 33 | 16.8%
0 | 30 | 15.2%
9 | 21 | 10.7%
8 | 18 | 9.1%
3 | 16 | 8.1%
5 | 13 | 6.6%
6 | 9 | 4.6%
7 | 7 | 3.6%
4 | 6 | 3.0%

Other Punctuation
Value | Count | Frequency (%)
. | 74 | 66.7%
' | 37 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 310 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
. | 74 | 23.9%
1 | 44 | 14.2%
' | 37 | 11.9%
2 | 33 | 10.6%
0 | 30 | 9.7%
9 | 21 | 6.8%
8 | 18 | 5.8%
3 | 16 | 5.2%
5 | 13 | 4.2%
6 | 9 | 2.9%
Other values (3) | 15 | 4.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 310 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
. | 74 | 23.9%
1 | 44 | 14.2%
' | 37 | 11.9%
2 | 33 | 10.6%
0 | 30 | 9.7%
9 | 21 | 6.8%
8 | 18 | 5.8%
3 | 16 | 5.2%
5 | 13 | 4.2%
6 | 9 | 2.9%
Other values (3) | 15 | 4.8%

Interactions


Correlations

Variable | 회사 | 호선 | (unnamed) | 역 수 | 연 장(km) | 개통일
회사 | 1.000 | 0.830 | 1.000 | 0.000 | 0.000 | 0.968
호선 | 0.830 | 1.000 | 1.000 | 0.000 | 0.100 | 0.915
(unnamed) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
역 수 | 0.000 | 0.000 | 1.000 | 1.000 | 0.974 | 0.867
연 장(km) | 0.000 | 0.100 | 1.000 | 0.974 | 1.000 | 0.960
개통일 | 0.968 | 0.915 | 1.000 | 0.867 | 0.960 | 1.000

Variable | 회사 | 역 수
회사 | 1.000 | 0.000
역 수 | 0.000 | 1.000

Variable | 호선 | 회사 | 역 수
호선 | 1.000 | 0.648 | 0.000
회사 | 0.648 | 1.000 | 0.000
역 수 | 0.000 | 0.000 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

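First rows

Index | 회사 | 호선 | (unnamed) | 역 수 | 연 장(km) | 개통일
0 | 서울메트로 | 1 | 서울↔청량리 | 9 | 7.8 | '74.8.15
1 | 서울메트로 | 1 | 동묘앞 | 1 | - | '05.12.21
2 | 서울메트로 | 2 | 신설동(2)↔종합운동장 | 11 | 14.3 | '80.10.31
3 | 서울메트로 | 2 | 종합운동장↔교대(2) | 5 | 5.5 | '82.12.23
4 | 서울메트로 | 2 | 을지입구↔성수 | 9 | 7.9 | '83.9.16
5 | 서울메트로 | 2 | 교대(2)↔서울대입구 | 5 | 6.7 | '83.12.17
6 | 서울메트로 | 2 | 서울대입구↔을지입구 | 16 | 19.8 | '84.5.22
7 | 서울메트로 | 2 | 신도림↔양천구청 | 2 | 2.7 | '92.5.22
8 | 서울메트로 | 2 | 양천구청↔신정네거리 | 1 | 1.9 | '96.2.29
9 | 서울메트로 | 2 | 신정네거리↔까치산 | - | 1.4 | '96.3.20

Last rows

Index | 회사 | 호선 | (unnamed) | 역 수 | 연 장(km) | 개통일
27 | 도시철도공사 | 6 | 이태원↔약수 | 4 | - | '01.3.9
28 | 서울교통공사 | 6 | 봉화산↔신내 | 1 | 1.3 | '19.12.21
29 | 도시철도공사 | 7 | 장암↔건대입구 | 19 | 19 | '96.10.11
30 | 도시철도공사 | 7 | 온수↔신풍 | 8 | 9.2 | '00.2.29
31 | 도시철도공사 | 7 | 온수↔부평구청 | 9 | 10.2 | '12.10.27
32 | 도시철도공사 | 7 | 건대입구↔신풍 | 15 | 18.7 | '00.8.1
33 | 도시철도공사 | 8 | 잠실↔모란 | 13 | 13.1 | '96.11.23
34 | 도시철도공사 | 8 | 암사↔잠실 | 4 | 4.6 | '99.07.02
35 | 서울메트로9호선운영 | 9 | 신논현↔종합운동장 | 5 | 4.7 | '15.3.28
36 | 서울교통공사 | 9 | 종합운동장(9)↔중앙보훈병원 | 8 | 8.9 | '18.12.1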