Overview

Dataset statistics

Number of variables: 7
Number of observations: 126
Missing cells: 130
Missing cells (%): 14.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.1 KiB
Average record size in memory: 58.0 B
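The missing-cell percentage follows from the other figures in this table: 7 variables times 126 observations gives the total cell count, and the 130 missing cells are a share of that. A minimal sketch of the arithmetic (the variable names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 7
n_observations = 126
missing_cells = 130

total_cells = n_variables * n_observations        # every row/column position
missing_pct = 100 * missing_cells / total_cells   # share of empty cells

print(f"{missing_cells} of {total_cells} cells missing = {missing_pct:.1f}%")
```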

Variable types

Text: 2
Categorical: 2
Unsupported: 2
Numeric: 1

Dataset

Description: File download (파일 다운로드)
Author: Seoul Metro (서울교통공사)
URL: https://data.seoul.go.kr/dataList/OA-11572/S/1/datasetView.do

Alerts

시 설 개 요 is highly overall correlated with Unnamed: 4 (High correlation)
Unnamed: 4 is highly overall correlated with 시 설 개 요 (High correlation)
시 설 명 has 118 (93.7%) missing values (Missing)
Unnamed: 3 has 5 (4.0%) missing values (Missing)
준공년도 has 6 (4.8%) missing values (Missing)
Unnamed: 3 is an unsupported type; check if it needs cleaning or further analysis (Unsupported)
Unnamed: 5 is an unsupported type; check if it needs cleaning or further analysis (Unsupported)

Reproduction

Analysis started: 2024-04-29 15:52:18.094300
Analysis finished: 2024-04-29 15:52:19.034643
Duration: 0.94 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시 설 명
Text

MISSING 

Distinct: 6
Distinct (%): 75.0%
Missing: 118
Missing (%): 93.7%
Memory size: 1.1 KiB

Length

Max length: 8
Median length: 3
Mean length: 3.625
Min length: 3

Characters and Unicode

Total characters: 29
Distinct characters: 11
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
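As an illustration of how such character properties can be counted, the sketch below uses Python's standard `unicodedata` module. The list of values is an assumption: it is reconstructed from this variable's sample rows and value counts, not read from the original file.

```python
import unicodedata
from collections import Counter

# Assumption: the eight non-missing values of 시 설 명, reconstructed from the
# report's sample rows and value counts (order beyond the sample is a guess).
values = ["호 선", "총 계 120역", "1호선", "2호선", "2호선", "3호선", "3호선", "4호선"]

text = "".join(values)

# unicodedata.category returns the two-letter General Category of a code point:
# "Lo" = Other Letter, "Nd" = Decimal Number, "Zs" = Space Separator.
category_counts = Counter(unicodedata.category(ch) for ch in text)

print("Total characters:", len(text))
print("Distinct characters:", len(set(text)))
print("Categories:", dict(category_counts))
```

Under that assumption the counts agree with the report: 29 characters, 11 of them distinct, split into 17 Other Letter, 9 Decimal Number and 3 Space Separator.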

Unique

Unique: 4
Unique (%): 50.0%
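The distinct, unique and missing figures for this variable appear to be computed against different bases: missing values against all 126 rows, distinct and unique values against the 8 non-missing entries. A small sketch that reproduces them, again assuming the value list reconstructed from the sample rows and value counts:

```python
from collections import Counter
from statistics import mean, median

# Assumption: values reconstructed from the report, not read from the source file.
values = ["호 선", "총 계 120역", "1호선", "2호선", "2호선", "3호선", "3호선", "4호선"]
n_rows = 126                 # observations in the dataset
n_present = len(values)      # non-missing entries
n_missing = n_rows - n_present

counts = Counter(values)
distinct = len(counts)                               # different values
unique = sum(1 for c in counts.values() if c == 1)   # values seen exactly once
lengths = [len(v) for v in values]

print(f"Missing: {n_missing} ({100 * n_missing / n_rows:.1f}%)")
print(f"Distinct: {distinct} ({100 * distinct / n_present:.1f}%)")
print(f"Unique: {unique} ({100 * unique / n_present:.1f}%)")
print("Length min/median/mean/max:",
      min(lengths), median(lengths), mean(lengths), max(lengths))
```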

Sample

1st row: 호 선
2nd row: 총 계 120역
3rd row: 1호선
4th row: 2호선
5th row: 2호선

Value | Count | Frequency (%)
2호선 | 2 | 18.2%
3호선 | 2 | 18.2%
(value not recoverable) | 1 | 9.1%
(value not recoverable) | 1 | 9.1%
(value not recoverable) | 1 | 9.1%
(value not recoverable) | 1 | 9.1%
120역 | 1 | 9.1%
1호선 | 1 | 9.1%
4호선 | 1 | 9.1%

Most occurring characters

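Value | Count | Frequency (%)
(character not recoverable) | 7 | 24.1%
(character not recoverable) | 7 | 24.1%
2 | 3 | 10.3%
(space) | 3 | 10.3%
3 | 2 | 6.9%
1 | 2 | 6.9%
(character not recoverable) | 1 | 3.4%
(character not recoverable) | 1 | 3.4%
0 | 1 | 3.4%
(character not recoverable) | 1 | 3.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 17 | 58.6%
Decimal Number | 9 | 31.0%
Space Separator | 3 | 10.3%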

Most frequent character per category

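Other Letter
Value | Count | Frequency (%)
(character not recoverable) | 7 | 41.2%
(character not recoverable) | 7 | 41.2%
(character not recoverable) | 1 | 5.9%
(character not recoverable) | 1 | 5.9%
(character not recoverable) | 1 | 5.9%

Decimal Number
Value | Count | Frequency (%)
2 | 3 | 33.3%
3 | 2 | 22.2%
1 | 2 | 22.2%
0 | 1 | 11.1%
4 | 1 | 11.1%

Space Separator
Value | Count | Frequency (%)
(space) | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 17 | 58.6%
Common | 12 | 41.4%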

Most frequent character per script

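Common
Value | Count | Frequency (%)
2 | 3 | 25.0%
(space) | 3 | 25.0%
3 | 2 | 16.7%
1 | 2 | 16.7%
0 | 1 | 8.3%
4 | 1 | 8.3%

Hangul
Value | Count | Frequency (%)
(character not recoverable) | 7 | 41.2%
(character not recoverable) | 7 | 41.2%
(character not recoverable) | 1 | 5.9%
(character not recoverable) | 1 | 5.9%
(character not recoverable) | 1 | 5.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 17 | 58.6%
ASCII | 12 | 41.4%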

Most frequent character per block

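Hangul
Value | Count | Frequency (%)
(character not recoverable) | 7 | 41.2%
(character not recoverable) | 7 | 41.2%
(character not recoverable) | 1 | 5.9%
(character not recoverable) | 1 | 5.9%
(character not recoverable) | 1 | 5.9%

ASCII
Value | Count | Frequency (%)
2 | 3 | 25.0%
(space) | 3 | 25.0%
3 | 2 | 16.7%
1 | 2 | 16.7%
0 | 1 | 8.3%
4 | 1 | 8.3%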

Unnamed: 1
Text

Distinct: 116
Distinct (%): 92.8%
Missing: 1
Missing (%): 0.8%
Memory size: 1.1 KiB

Length

Max length: 7
Median length: 6
Mean length: 4.264
Min length: 2

Characters and Unicode

Total characters: 533
Distinct characters: 146
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 107
Unique (%): 85.6%

Sample

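1st row: 역 명
2nd row: 소계 10역
3rd row: 서 울 역
4th row: 시 청
5th row: 종 각

Value | Count | Frequency (%)
(value not recoverable) | 15 | 5.4%
(value not recoverable) | 12 | 4.3%
(value not recoverable) | 10 | 3.6%
(value not recoverable) | 8 | 2.9%
(value not recoverable) | 6 | 2.2%
(value not recoverable) | 5 | 1.8%
(value not recoverable) | 5 | 1.8%
(value not recoverable) | 5 | 1.8%
(value not recoverable) | 4 | 1.4%
(value not recoverable) | 4 | 1.4%
Other values (133) | 204 | 73.4%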

Most occurring characters

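Value | Count | Frequency (%)
(space) | 159 | 29.8%
(character not recoverable) | 21 | 3.9%
(character not recoverable) | 15 | 2.8%
(character not recoverable) | 15 | 2.8%
(character not recoverable) | 13 | 2.4%
(character not recoverable) | 11 | 2.1%
(character not recoverable) | 11 | 2.1%
(character not recoverable) | 9 | 1.7%
(character not recoverable) | 9 | 1.7%
(character not recoverable) | 8 | 1.5%
Other values (136) | 262 | 49.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 360 | 67.5%
Space Separator | 159 | 29.8%
Decimal Number | 14 | 2.6%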

Most frequent character per category

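Other Letter
Value | Count | Frequency (%)
(character not recoverable) | 21 | 5.8%
(character not recoverable) | 15 | 4.2%
(character not recoverable) | 15 | 4.2%
(character not recoverable) | 13 | 3.6%
(character not recoverable) | 11 | 3.1%
(character not recoverable) | 11 | 3.1%
(character not recoverable) | 9 | 2.5%
(character not recoverable) | 9 | 2.5%
(character not recoverable) | 8 | 2.2%
(character not recoverable) | 7 | 1.9%
Other values (128) | 241 | 66.9%

Decimal Number
Value | Count | Frequency (%)
3 | 5 | 35.7%
4 | 2 | 14.3%
5 | 2 | 14.3%
0 | 2 | 14.3%
6 | 1 | 7.1%
2 | 1 | 7.1%
1 | 1 | 7.1%

Space Separator
Value | Count | Frequency (%)
(space) | 159 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 360 | 67.5%
Common | 173 | 32.5%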

Most frequent character per script

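Hangul
Value | Count | Frequency (%)
(character not recoverable) | 21 | 5.8%
(character not recoverable) | 15 | 4.2%
(character not recoverable) | 15 | 4.2%
(character not recoverable) | 13 | 3.6%
(character not recoverable) | 11 | 3.1%
(character not recoverable) | 11 | 3.1%
(character not recoverable) | 9 | 2.5%
(character not recoverable) | 9 | 2.5%
(character not recoverable) | 8 | 2.2%
(character not recoverable) | 7 | 1.9%
Other values (128) | 241 | 66.9%

Common
Value | Count | Frequency (%)
(space) | 159 | 91.9%
3 | 5 | 2.9%
4 | 2 | 1.2%
5 | 2 | 1.2%
0 | 2 | 1.2%
6 | 1 | 0.6%
2 | 1 | 0.6%
1 | 1 | 0.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 360 | 67.5%
ASCII | 173 | 32.5%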

Most frequent character per block

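ASCII
Value | Count | Frequency (%)
(space) | 159 | 91.9%
3 | 5 | 2.9%
4 | 2 | 1.2%
5 | 2 | 1.2%
0 | 2 | 1.2%
6 | 1 | 0.6%
2 | 1 | 0.6%
1 | 1 | 0.6%

Hangul
Value | Count | Frequency (%)
(character not recoverable) | 21 | 5.8%
(character not recoverable) | 15 | 4.2%
(character not recoverable) | 15 | 4.2%
(character not recoverable) | 13 | 3.6%
(character not recoverable) | 11 | 3.1%
(character not recoverable) | 11 | 3.1%
(character not recoverable) | 9 | 2.5%
(character not recoverable) | 9 | 2.5%
(character not recoverable) | 8 | 2.2%
(character not recoverable) | 7 | 1.9%
Other values (128) | 241 | 66.9%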

시 설 개 요
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
상대식 | 82
섬식 | 33
<NA> | 5
섬 식 | 5
형 식 | 1

Length

Max length: 4
Median length: 3
Mean length: 2.7777778
Min length: 2

Unique

Unique: 1
Unique (%): 0.8%

Sample

1st row: 형 식
2nd row: <NA>
3rd row: <NA>
4th row: 섬 식
5th row: 상대식

Common Values

Value | Count | Frequency (%)
상대식 | 82 | 65.1%
섬식 | 33 | 26.2%
<NA> | 5 | 4.0%
섬 식 | 5 | 4.0%
형 식 | 1 | 0.8%
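Two of these categories, 섬식 and 섬 식, differ only by an inner space, and 형 식 is the value found in row 0 of the sample, which looks like a header cell rather than data. One possible cleaning step, sketched here as an assumption about what the data intends and not as something the report performs, is to strip whitespace before counting:

```python
from collections import Counter

# Value counts copied from the "Common Values" table of 시 설 개 요.
raw_counts = {"상대식": 82, "섬식": 33, "<NA>": 5, "섬 식": 5, "형 식": 1}


def normalize(label: str) -> str:
    """Remove all whitespace so spaced-out spellings collapse together."""
    return "".join(label.split())


merged = Counter()
for label, count in raw_counts.items():
    merged[normalize(label)] += count

# Categories sorted by frequency after merging the two spellings.
print(merged.most_common())
```

With this normalization the five raw categories collapse to four, and the two spellings of 섬식 combine into a single count of 38.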

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
상대식 | 82 | 62.1%
섬식 | 33 | 25.0%
(value not recoverable) | 6 | 4.5%
na | 5 | 3.8%
(value not recoverable) | 5 | 3.8%
(value not recoverable) | 1 | 0.8%

Unnamed: 3
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 5
Missing (%): 4.0%
Memory size: 1.1 KiB

Unnamed: 4
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 6.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
2층 | 79
3층 | 27
4층 | 11
<NA> | 5
층 수 | 1
Other values (3) | 3

Length

Max length: 4
Median length: 2
Mean length: 2.0873016
Min length: 2

Unique

Unique: 4
Unique (%): 3.2%

Sample

1st row: 층 수
2nd row: <NA>
3rd row: <NA>
4th row: 2층
5th row: 2층

Common Values

Value | Count | Frequency (%)
2층 | 79 | 62.7%
3층 | 27 | 21.4%
4층 | 11 | 8.7%
<NA> | 5 | 4.0%
층 수 | 1 | 0.8%
6층 | 1 | 0.8%
1층 | 1 | 0.8%
5층 | 1 | 0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2층 | 79 | 62.2%
3층 | 27 | 21.3%
4층 | 11 | 8.7%
na | 5 | 3.9%
(value not recoverable) | 1 | 0.8%
(value not recoverable) | 1 | 0.8%
6층 | 1 | 0.8%
1층 | 1 | 0.8%
5층 | 1 | 0.8%

Unnamed: 5
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

준공년도
Real number (ℝ)

MISSING 

Distinct: 13
Distinct (%): 10.8%
Missing: 6
Missing (%): 4.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 1985.0583
Minimum: 1974
Maximum: 2010
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 1974
5-th percentile: 1974
Q1: 1983
Median: 1985
Q3: 1985
95-th percentile: 1994.1
Maximum: 2010
Range: 36
Interquartile range (IQR): 2
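These quantiles can be cross-checked against the value counts listed for this variable. The sketch below rebuilds the 120 non-missing years from those counts and applies linear interpolation between order statistics through the standard library's `statistics.quantiles` with `method="inclusive"`. That choice of interpolation is an assumption about how the report computes percentiles; it happens to reproduce the fractional 95th percentile.

```python
from statistics import median, quantiles

# Year -> count, copied from the report's value tables for 준공년도.
# The three "other values" in the common-values table are 1990, 1994 and 1996.
year_counts = {1974: 9, 1980: 11, 1982: 4, 1983: 14, 1984: 17, 1985: 47,
               1990: 1, 1992: 2, 1993: 8, 1994: 1, 1996: 1, 2005: 2, 2010: 3}

# Expand the counts back into a sorted list of individual observations.
years = sorted(year for year, n in year_counts.items() for _ in range(n))

# n=4 yields the three quartile cut points; n=20 yields cut points every 5%.
q1, q2, q3 = quantiles(years, n=4, method="inclusive")
twentieths = quantiles(years, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

print("Q1 / median / Q3:", q1, median(years), q3)
print("5th / 95th percentile:", p5, p95)
print("Range:", max(years) - min(years), "IQR:", q3 - q1)
```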

Descriptive statistics

Standard deviation: 6.4720418
Coefficient of variation (CV): 0.0032603786
Kurtosis: 5.372497
Mean: 1985.0583
Median Absolute Deviation (MAD): 1
Skewness: 1.7457051
Sum: 238207
Variance: 41.887325
Monotonicity: Not monotonic
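Several of these figures are simple functions of one another: the mean is the sum divided by the 120 non-missing values, the variance is the square of the standard deviation, and the CV is the standard deviation divided by the mean. A minimal check with the standard `statistics` module, using the year counts from this variable's value tables (skewness and kurtosis are left out because their exact estimator is not stated in the report):

```python
from statistics import mean, median, stdev, variance

# Year -> count, copied from the report's value tables for 준공년도.
year_counts = {1974: 9, 1980: 11, 1982: 4, 1983: 14, 1984: 17, 1985: 47,
               1990: 1, 1992: 2, 1993: 8, 1994: 1, 1996: 1, 2005: 2, 2010: 3}
years = [year for year, n in year_counts.items() for _ in range(n)]

total = sum(years)
avg = mean(years)
sd = stdev(years)        # sample standard deviation (divides by n - 1)
var = variance(years)    # sample variance, equal to sd squared
cv = sd / avg            # coefficient of variation

# Median absolute deviation: median distance from the median.
center = median(years)
mad = median([abs(y - center) for y in years])

print(f"sum={total} mean={avg:.4f} sd={sd:.7f} var={var:.6f} cv={cv:.10f} mad={mad}")
```

The sample (n - 1) forms of the standard deviation and variance are the ones that match the reported values.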
Histogram with fixed size bins (bins=13)

Common values

Value | Count | Frequency (%)
1985 | 47 | 37.3%
1984 | 17 | 13.5%
1983 | 14 | 11.1%
1980 | 11 | 8.7%
1974 | 9 | 7.1%
1993 | 8 | 6.3%
1982 | 4 | 3.2%
2010 | 3 | 2.4%
2005 | 2 | 1.6%
1992 | 2 | 1.6%
Other values (3) | 3 | 2.4%
(Missing) | 6 | 4.8%

Minimum 10 values

Value | Count | Frequency (%)
1974 | 9 | 7.1%
1980 | 11 | 8.7%
1982 | 4 | 3.2%
1983 | 14 | 11.1%
1984 | 17 | 13.5%
1985 | 47 | 37.3%
1990 | 1 | 0.8%
1992 | 2 | 1.6%
1993 | 8 | 6.3%
1994 | 1 | 0.8%

Maximum 10 values

Value | Count | Frequency (%)
2010 | 3 | 2.4%
2005 | 2 | 1.6%
1996 | 1 | 0.8%
1994 | 1 | 0.8%
1993 | 8 | 6.3%
1992 | 2 | 1.6%
1990 | 1 | 0.8%
1985 | 47 | 37.3%
1984 | 17 | 13.5%
1983 | 14 | 11.1%

Interactions


Correlations

Variable | 시 설 명 | 시 설 개 요 | Unnamed: 4 | 준공년도
시 설 명 | 1.000 | 1.000 | 1.000 | 0.000
시 설 개 요 | 1.000 | 1.000 | 0.713 | 0.236
Unnamed: 4 | 1.000 | 0.713 | 1.000 | 0.708
준공년도 | 0.000 | 0.236 | 0.708 | 1.000

Variable | Unnamed: 4 | 시 설 개 요
Unnamed: 4 | 1.000 | 0.575
시 설 개 요 | 0.575 | 1.000

Variable | 준공년도 | 시 설 개 요 | Unnamed: 4
준공년도 | 1.000 | 0.229 | 0.490
시 설 개 요 | 0.229 | 1.000 | 0.575
Unnamed: 4 | 0.490 | 0.575 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

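First rows

Index | 시 설 명 | Unnamed: 1 | 시 설 개 요 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | 준공년도
0 | 호 선 | 역 명 | 형 식 | 길이(M) | 층 수 | 면적(㎡) | <NA>
1 | 총 계 120역 | <NA> | <NA> | NaN | <NA> | 996371 | <NA>
2 | 1호선 | 소계 10역 | <NA> | NaN | <NA> | 87594 | <NA>
3 | <NA> | 서 울 역 | 섬 식 | 210 | 2층 | 10335 | 1974
4 | <NA> | 시 청 | 상대식 | 210 | 2층 | 10421 | 1974
5 | <NA> | 종 각 | 상대식 | 210 | 2층 | 9072 | 1974
6 | <NA> | 종 로 3 가 | 상대식 | 210 | 2층 | 9311 | 1974
7 | <NA> | 종 로 5 가 | 상대식 | 210 | 2층 | 10465 | 1974
8 | <NA> | 동 대 문 | 상대식 | 210 | 2층 | 5490 | 1974
9 | <NA> | 동 묘 앞 | 상대식 | 210 | 6층 | 9473 | 2005

Last rows

Index | 시 설 명 | Unnamed: 1 | 시 설 개 요 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | 준공년도
116 | <NA> | 회 현 | 섬식 | 205 | 4층 | 11073 | 1985
117 | <NA> | 서 울 역 | 섬식 | 205 | 2층 | 9564 | 1985
118 | <NA> | 숙 대 입 구 | 상대식 | 205 | 2층 | 7027 | 1985
119 | <NA> | 삼 각 지 | 상대식 | 205 | 2층 | 9164 | 1985
120 | <NA> | 신 용 산 | 상대식 | 205 | 2층 | 6439 | 1985
121 | <NA> | 이 촌 | 상대식 | 205 | 2층 | 7523 | 1985
122 | <NA> | 동 작 | 상대식 | 205 | 3층 | 14139 | 1985
123 | <NA> | 총신대입구 | 상대식 | 205 | 2층 | 7596 | 1985
124 | <NA> | 사 당 | 섬식 | 205 | 3층 | 15490 | 1985
125 | <NA> | 남 태 령 | 섬식 | 205 | 3층 | 6120 | 1994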
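The sample suggests why two columns are flagged as unsupported and why 시 설 명 is 93.7% missing: row 0 holds a second header line (호 선, 역 명, 형 식, 길이(M), 층 수, 면적(㎡)), and the line name only appears on total and subtotal rows. The sketch below shows one way such a layout could be tidied with plain Python. It works on the first five sample rows only, and the rule "a row with a value in the first column is a total or subtotal row" is an assumption drawn from those rows, not a documented property of the dataset.

```python
# First five rows of the sample; None stands for <NA> / NaN.
rows = [
    ["호 선", "역 명", "형 식", "길이(M)", "층 수", "면적(㎡)", None],
    ["총 계 120역", None, None, None, None, 996371, None],
    ["1호선", "소계 10역", None, None, None, 87594, None],
    [None, "서 울 역", "섬 식", 210, "2층", 10335, 1974],
    [None, "시 청", "상대식", 210, "2층", 10421, 1974],
]

# Row 0 carries the real column labels; the last column keeps its parsed name.
header = [cell if cell is not None else "준공년도" for cell in rows[0]]

records = []
current_line = None
for row in rows[1:]:
    if row[0] is not None:
        # Assumed total/subtotal row: remember the label, skip the row itself.
        current_line = row[0]
        continue
    record = dict(zip(header, row))
    record[header[0]] = current_line   # forward-fill the line name
    records.append(record)

for record in records:
    print(record)
```

After this step each station row carries its line name and the numeric columns no longer contain header text, which is the kind of cleaning the "Unsupported" alerts hint at.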