Overview

Dataset statistics

Number of variables: 15
Number of observations: 39
Missing cells: 165
Missing cells (%): 28.2%
Duplicate rows: 2
Duplicate rows (%): 5.1%
Total size in memory: 4.7 KiB
Average record size in memory: 124.4 B
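
The percentages above follow from the raw counts. A minimal Python sketch of that arithmetic, assuming the total cell count is simply variables × observations (the report does not state the denominator explicitly):

```python
# Counts copied from the dataset statistics above.
n_variables = 15
n_observations = 39
missing_cells = 165
duplicate_rows = 2

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations

# Percentages rounded to one decimal place, as in the report.
missing_pct = round(100 * missing_cells / total_cells, 1)
duplicate_pct = round(100 * duplicate_rows / n_observations, 1)

print(total_cells, missing_pct, duplicate_pct)
```

Under that assumption the sketch reproduces the reported 28.2% and 5.1%.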

Variable types

Text: 3
Categorical: 1
Numeric: 1
Unsupported: 10

Dataset

Description: File download (파일 다운로드)
Author: Seoul Metro (서울교통공사)
URL: https://data.seoul.go.kr/dataList/OA-13214/F/1/datasetView.do

Alerts

Dataset has 2 (5.1%) duplicate rows [Duplicates]
구분 has 26 (66.7%) missing values [Missing]
설 비 명 has 21 (53.8%) missing values [Missing]
Unnamed: 2 has 7 (17.9%) missing values [Missing]
(blank column name) has 11 (28.2%) missing values [Missing]
1~4호선 has 10 (25.6%) missing values [Missing]
Unnamed: 6 has 10 (25.6%) missing values [Missing]
Unnamed: 7 has 10 (25.6%) missing values [Missing]
Unnamed: 8 has 10 (25.6%) missing values [Missing]
Unnamed: 9 has 10 (25.6%) missing values [Missing]
5~8호선 has 10 (25.6%) missing values [Missing]
Unnamed: 11 has 10 (25.6%) missing values [Missing]
Unnamed: 12 has 10 (25.6%) missing values [Missing]
Unnamed: 13 has 10 (25.6%) missing values [Missing]
Unnamed: 14 has 10 (25.6%) missing values [Missing]
1~4호선 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 6 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 7 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 8 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 9 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
5~8호선 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 11 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 12 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 13 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 14 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
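
A plausible reason for the "Unsupported" alerts is that these columns mix sub-header labels (for example 소계), dash placeholders and counts, as the rows in the Sample section suggest. The sketch below shows one way such cells could be coerced to numbers before re-profiling. `to_number` is a hypothetical helper, not part of ydata-profiling, and the choice to map labels and dashes to `None` is an assumption about the intended cleaning.

```python
from typing import Optional


def to_number(cell: object) -> Optional[float]:
    """Hypothetical helper: parse one raw cell, returning None when it is not numeric."""
    if cell is None:
        return None
    text = str(cell).strip().replace(",", "")
    # Placeholders that stand for "no value" in the raw sheet (assumed list).
    if text.lower() in {"", "-", "nan", "<na>"}:
        return None
    try:
        return float(text)
    except ValueError:
        # Sub-header labels such as "소계" or "1호선" end up here.
        return None


# Cell shapes taken from the Sample section: a label, counts, a dash, a gap.
raw_column = ["소계", "42", "42", "-", "145", None]
cleaned = [to_number(c) for c in raw_column]
print(cleaned)
```

Whether a dash should become a missing value or a zero depends on what the publisher meant by it, so that choice is worth confirming against the source documentation.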

Reproduction

Analysis started: 2024-04-29 16:44:45.086251
Analysis finished: 2024-04-29 16:44:47.315563
Duration: 2.23 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구분
Text

MISSING 

Distinct: 11
Distinct (%): 84.6%
Missing: 26
Missing (%): 66.7%
Memory size: 444.0 B
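
The distinct percentage appears to be taken over non-missing values rather than over all 39 rows. A short sketch of that assumed definition (`distinct_pct` is a hypothetical name used only for illustration):

```python
def distinct_pct(distinct: int, total: int, missing: int) -> float:
    """Assumed definition: distinct values as a share of non-missing values."""
    return round(100 * distinct / (total - missing), 1)


# 구분: 11 distinct values, 39 rows, 26 missing.
print(distinct_pct(11, 39, 26))
```

With 13 non-missing values this gives 84.6, matching the figure above. Dividing by all 39 rows would give about 28.2 instead.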

Length

Max length: 2
Median length: 1
Mean length: 1.2307692
Min length: 1

Characters and Unicode

Total characters: 16
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10
Unique (%): 76.9%

Sample

1st row:
2nd row:
3rd row:
4th row:
5th row: 역사
Value | Count | Frequency (%)
 | 3 | 23.1%
 | 1 | 7.7%
 | 1 | 7.7%
 | 1 | 7.7%
역사 | 1 | 7.7%
전기 | 1 | 7.7%
설비 | 1 | 7.7%
 | 1 | 7.7%
 | 1 | 7.7%
 | 1 | 7.7%

Most occurring characters

ValueCountFrequency (%)
4
25.0%
2
12.5%
2
12.5%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 16
100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4
25.0%
2
12.5%
2
12.5%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%

Most occurring scripts

ValueCountFrequency (%)
Hangul 16
100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4
25.0%
2
12.5%
2
12.5%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 16
100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
4
25.0%
2
12.5%
2
12.5%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%

설 비 명
Text

MISSING 

Distinct: 15
Distinct (%): 83.3%
Missing: 21
Missing (%): 53.8%
Memory size: 444.0 B

Length

Max length: 9
Median length: 5
Mean length: 2.9444444
Min length: 1

Characters and Unicode

Total characters: 53
Distinct characters: 34
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 12
Unique (%): 66.7%

Sample

1st row: 변전소
2nd row: 정류기
3rd row: 변압기
4th row: 차단기
5th row: 원제반
Value | Count | Frequency (%)
변압기 | 2 | 11.1%
차단기 | 2 | 11.1%
 | 2 | 11.1%
변전소 | 1 | 5.6%
정류기 | 1 | 5.6%
원제반 | 1 | 5.6%
담당역사 | 1 | 5.6%
역사전기실 | 1 | 5.6%
본선(터널)전기실 | 1 | 5.6%
강체 | 1 | 5.6%
Other values (5) | 5 | 27.8%

Most occurring characters

ValueCountFrequency (%)
7
 
13.2%
5
 
9.4%
3
 
5.7%
2
 
3.8%
2
 
3.8%
2
 
3.8%
2
 
3.8%
2
 
3.8%
2
 
3.8%
2
 
3.8%
Other values (24) 24
45.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 51
96.2%
Close Punctuation 1
 
1.9%
Open Punctuation 1
 
1.9%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
7
 
13.7%
5
 
9.8%
3
 
5.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
Other values (22) 22
43.1%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 51
96.2%
Common 2
 
3.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
7
 
13.7%
5
 
9.8%
3
 
5.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
Other values (22) 22
43.1%
Common
ValueCountFrequency (%)
) 1
50.0%
( 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 51
96.2%
ASCII 2
 
3.8%

Most frequent character per block

Hangul
ValueCountFrequency (%)
7
 
13.7%
5
 
9.8%
3
 
5.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
Other values (22) 22
43.1%
ASCII
ValueCountFrequency (%)
) 1
50.0%
( 1
50.0%

Unnamed: 2
Text

MISSING 

Distinct: 25
Distinct (%): 78.1%
Missing: 7
Missing (%): 17.9%
Memory size: 444.0 B

Length

Max length: 10
Median length: 7
Mean length: 3.5
Min length: 1

Characters and Unicode

Total characters: 112
Distinct characters: 47
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 21
Unique (%): 65.6%

Sample

1st row:
2nd row: 수전용
3rd row: 연락용
4th row: 실리콘
5th row:
Value | Count | Frequency (%)
 | 5 | 15.6%
vcb | 2 | 6.2%
전기실 | 2 | 6.2%
역사 | 2 | 6.2%
22.9kv)연장 | 2 | 6.2%
rtu | 1 | 3.1%
6.6kv | 1 | 3.1%
지상부 | 1 | 3.1%
지하부 | 1 | 3.1%
터널용 | 1 | 3.1%
Other values (14) | 14 | 43.8%

Most occurring characters

ValueCountFrequency (%)
) 8
 
7.1%
( 8
 
7.1%
7
 
6.2%
6
 
5.4%
5
 
4.5%
V 5
 
4.5%
2 4
 
3.6%
4
 
3.6%
3
 
2.7%
. 3
 
2.7%
Other values (37) 59
52.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 69
61.6%
Uppercase Letter 12
 
10.7%
Close Punctuation 8
 
7.1%
Open Punctuation 8
 
7.1%
Decimal Number 8
 
7.1%
Other Punctuation 4
 
3.6%
Lowercase Letter 3
 
2.7%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
7
 
10.1%
6
 
8.7%
5
 
7.2%
4
 
5.8%
3
 
4.3%
3
 
4.3%
3
 
4.3%
3
 
4.3%
3
 
4.3%
2
 
2.9%
Other values (23) 30
43.5%
Uppercase Letter
ValueCountFrequency (%)
V 5
41.7%
C 2
 
16.7%
B 2
 
16.7%
U 1
 
8.3%
T 1
 
8.3%
R 1
 
8.3%
Decimal Number
ValueCountFrequency (%)
2 4
50.0%
6 2
25.0%
9 2
25.0%
Other Punctuation
ValueCountFrequency (%)
. 3
75.0%
, 1
 
25.0%
Close Punctuation
ValueCountFrequency (%)
) 8
100.0%
Open Punctuation
ValueCountFrequency (%)
( 8
100.0%
Lowercase Letter
ValueCountFrequency (%)
k 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 69
61.6%
Common 28
25.0%
Latin 15
 
13.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
7
 
10.1%
6
 
8.7%
5
 
7.2%
4
 
5.8%
3
 
4.3%
3
 
4.3%
3
 
4.3%
3
 
4.3%
3
 
4.3%
2
 
2.9%
Other values (23) 30
43.5%
Common
ValueCountFrequency (%)
) 8
28.6%
( 8
28.6%
2 4
14.3%
. 3
 
10.7%
6 2
 
7.1%
9 2
 
7.1%
, 1
 
3.6%
Latin
ValueCountFrequency (%)
V 5
33.3%
k 3
20.0%
C 2
 
13.3%
B 2
 
13.3%
U 1
 
6.7%
T 1
 
6.7%
R 1
 
6.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 69
61.6%
ASCII 43
38.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
) 8
18.6%
( 8
18.6%
V 5
11.6%
2 4
9.3%
. 3
 
7.0%
k 3
 
7.0%
6 2
 
4.7%
9 2
 
4.7%
C 2
 
4.7%
B 2
 
4.7%
Other values (4) 4
9.3%
Hangul
ValueCountFrequency (%)
7
 
10.1%
6
 
8.7%
5
 
7.2%
4
 
5.8%
3
 
4.3%
3
 
4.3%
3
 
4.3%
3
 
4.3%
3
 
4.3%
2
 
2.9%
Other values (23) 30
43.5%

단위
Categorical

Distinct: 5
Distinct (%): 12.8%
Missing: 0
Missing (%): 0.0%
Memory size: 444.0 B

Length

Max length: 4
Median length: 2
Mean length: 2.1794872
Min length: 1
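
These length figures can be cross-checked against the value counts in this variable's Common Values table. The sketch below assumes that the two values whose text did not survive in this copy of the report are one character long, and that the literal string `<NA>` counts as four characters. Both are inferences, not facts stated by the report.

```python
from statistics import mean, median

# (string length, count) pairs derived from the Common Values table:
# unreadable value x14, "<NA>" x11, "km" x7, "개소" x6, unreadable value x1.
# Assumption: both unreadable values are one character long.
length_counts = [(1, 14), (4, 11), (2, 7), (2, 6), (1, 1)]
lengths = [length for length, count in length_counts for _ in range(count)]

print(max(lengths), median(lengths), round(mean(lengths), 7), min(lengths))
```

Under those assumptions the 39 lengths sum to 85, which gives the reported mean of 2.1794872 and a median of 2.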

Unique

Unique: 1
Unique (%): 2.6%

Sample

1st row: <NA>
2nd row: 개소
3rd row: 개소
4th row: 개소
5th row:

Common Values

Value | Count | Frequency (%)
 | 14 | 35.9%
<NA> | 11 | 28.2%
km | 7 | 17.9%
개소 | 6 | 15.4%
 | 1 | 2.6%

Length

Histogram of lengths of the category



Real number (ℝ)

MISSING 

Distinct: 26
Distinct (%): 92.9%
Missing: 11
Missing (%): 28.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 960.53571
Minimum: 10
Maximum: 4437
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 483.0 B

Quantile statistics

Minimum: 10
5-th percentile: 91.5
Q1: 198.5
Median: 455
Q3: 1439.25
95-th percentile: 2747.3
Maximum: 4437
Range: 4427
Interquartile range (IQR): 1240.75

Descriptive statistics

Standard deviation: 1084.8095
Coefficient of variation (CV): 1.1293797
Kurtosis: 2.5677552
Mean: 960.53571
Median Absolute Deviation (MAD): 356
Skewness: 1.6327804
Sum: 26895
Variance: 1176811.7
Monotonicity: Not monotonic
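
Several of these figures are derived from one another. The sketch below recomputes them from the reported sum, standard deviation, extremes and quartiles, using the standard textbook definitions. It is a consistency check on the printed numbers, not a description of how the profiler computes them internally.

```python
import math

# Figures copied from the quantile and descriptive statistics above.
total, n_rows, n_missing = 26895, 39, 11
std, minimum, maximum = 1084.8095, 10, 4437
q1, q3 = 198.5, 1439.25

n = n_rows - n_missing         # 28 non-missing observations
mean = total / n               # sum divided by non-missing count
cv = std / mean                # coefficient of variation
variance = std ** 2            # variance is the squared standard deviation
value_range = maximum - minimum
iqr = q3 - q1                  # interquartile range

print(round(mean, 5), round(cv, 7), round(variance, 1), value_range, iqr)
```

The recomputed values agree with the report to within the rounding of the printed standard deviation.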
Histogram with fixed size bins (bins=26)
Common values
Value | Count | Frequency (%)
309 | 2 | 5.1%
185 | 2 | 5.1%
10 | 1 | 2.6%
704 | 1 | 2.6%
2832 | 1 | 2.6%
996 | 1 | 2.6%
609 | 1 | 2.6%
4437 | 1 | 2.6%
329 | 1 | 2.6%
581 | 1 | 2.6%
Other values (16) | 16 | 41.0%
(Missing) | 11 | 28.2%

Minimum 10 values
Value | Count | Frequency (%)
10 | 1 | 2.6%
88 | 1 | 2.6%
98 | 1 | 2.6%
100 | 1 | 2.6%
185 | 2 | 5.1%
194 | 1 | 2.6%
200 | 1 | 2.6%
269 | 1 | 2.6%
278 | 1 | 2.6%
302 | 1 | 2.6%

Maximum 10 values
Value | Count | Frequency (%)
4437 | 1 | 2.6%
2832 | 1 | 2.6%
2590 | 1 | 2.6%
2321 | 1 | 2.6%
2224 | 1 | 2.6%
1898 | 1 | 2.6%
1713 | 1 | 2.6%
1348 | 1 | 2.6%
996 | 1 | 2.6%
910 | 1 | 2.6%

1~4호선
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10
Missing (%): 25.6%
Memory size: 444.0 B

Unnamed: 6
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10
Missing (%): 25.6%
Memory size: 444.0 B

Unnamed: 7
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10
Missing (%): 25.6%
Memory size: 444.0 B

Unnamed: 8
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10
Missing (%): 25.6%
Memory size: 444.0 B

Unnamed: 9
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10
Missing (%): 25.6%
Memory size: 444.0 B

5~8호선
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10
Missing (%): 25.6%
Memory size: 444.0 B

Unnamed: 11
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10
Missing (%): 25.6%
Memory size: 444.0 B

Unnamed: 12
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10
Missing (%): 25.6%
Memory size: 444.0 B

Unnamed: 13
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10
Missing (%): 25.6%
Memory size: 444.0 B

Unnamed: 14
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10
Missing (%): 25.6%
Memory size: 444.0 B

Interactions

[Figure: interactions plot]

Correlations

Variable | 구분 | 설 비 명 | Unnamed: 2 | 단위 | (blank)
구분 | 1.000 | 0.977 | 0.964 | 0.829 | 0.799
설 비 명 | 0.977 | 1.000 | 1.000 | 1.000 | 0.000
Unnamed: 2 | 0.964 | 1.000 | 1.000 | 0.913 | 0.000
단위 | 0.829 | 1.000 | 0.913 | 1.000 | 0.540
(blank) | 0.799 | 0.000 | 0.000 | 0.540 | 1.000

Variable | (blank) | 단위
(blank) | 1.000 | 0.221
단위 | 0.221 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

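First rows

Row | 구분 / 설 비 명 / Unnamed: 2 | 단위 | (blank) | 1~4호선 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | 5~8호선 | Unnamed: 11 | Unnamed: 12 | Unnamed: 13 | Unnamed: 14
0 | <NA> <NA> <NA> | <NA> | <NA> | 소계 | 1호선 | 2호선 | 3호선 | 4호선 | 소계 | 5호선 | 6호선 | 7호선 | 8호선
1 | 변전소 | 개소 | 98 | 42 | 3 | 15 | 13 | 11 | 56 | 19 | 12 | 20 | 5
2 | <NA> 수전용 | 개소 | 88 | 42 | 3 | 15 | 13 | 11 | 46 | 14 | 12 | 17 | 3
3 | <NA> 연락용 | 개소 | 10 | - | - | - | - | - | 10 | 5 | - | 3 | 2
4 | 정류기 실리콘 |  | 309 | 145 | 11 | 59 | 41 | 34 | 164 | 57 | 35 | 57 | 15
5 | <NA> 변압기 |  | 704 | 312 | 22 | 116 | 93 | 81 | 392 | 135 | 83 | 138 | 36
6 | <NA> <NA> 정류용 |  | 309 | 145 | 11 | 59 | 41 | 34 | 164 | 57 | 35 | 57 | 15
7 | <NA> <NA> (전차선) | <NA> | <NA> | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
8 | <NA> <NA> 배전용 |  | 194 | 83 | 5 | 27 | 26 | 25 | 112 | 38 | 24 | 40 | 10
9 | <NA> <NA> (역사) | <NA> | <NA> | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN

Last rows

Row | 구분 / 설 비 명 / Unnamed: 2 | 단위 | (blank) | 1~4호선 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | 5~8호선 | Unnamed: 11 | Unnamed: 12 | Unnamed: 13 | Unnamed: 14
29 | <NA> <NA> (VCB) | <NA> | <NA> | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
30 | <NA> | km | 910 | 436 | 21 | 193 | 129 | 93 | 474 | 157 | 92 | 168 | 57
31 | 강체 지하부 | km | 581 | 238 | 18 | 90 | 77 | 53 | 343 | 115 | 72 | 115 | 40
32 | 카테 지상부 | km | 329 | 198 | 3 | 103 | 52 | 40 | 131 | 42 | 20 | 53 | 17
33 | <NA> 나리 <NA> | <NA> | <NA> | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
34 | <NA> | km | 4437 | 1989 | 110 | 820 | 596 | 463 | 2448 | 833 | 485 | 804 | 326
35 | 수전(22.9kV)연장 | km | 609 | 238 | 12 | 76 | 77 | 74 | 371 | 156 | 52 | 100 | 64
36 | 연락(22.9kV)연장 | km | 996 | 435 | 18 | 188 | 122 | 107 | 561 | 191 | 101 | 198 | 71
37 | <NA> 배전(6.6kV) | km | 2832 | 1316 | 81 | 556 | 397 | 282 | 1516 | 487 | 332 | 506 | 191
38 | <NA> <NA> 연장 | <NA> | <NA> | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN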

Duplicate rows

Most frequently occurring

Row | 구분 | 설 비 명 | Unnamed: 2 | 단위 | (blank) | # duplicates
0 | <NA> | <NA> | (VCB) | <NA> | <NA> | 2
1 | <NA> | <NA> | 전기실 | <NA> | <NA> | 2
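
The table lists two distinct row patterns, each occurring twice, which matches the "Duplicate rows: 2" figure in the overview if that figure counts repeated patterns rather than surplus rows. The sketch below illustrates that reading. The rows are hypothetical stand-ins shaped like the table above, `duplicate_groups` is an illustrative helper, and comparing only the five displayed columns is an assumption.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return {row: count} for every row that occurs more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical rows over (구분, 설 비 명, Unnamed: 2, 단위, blank column);
# None stands for <NA>.
rows = [
    (None, None, "(VCB)", None, None),
    (None, "변압기", None, None, 704),
    (None, None, "전기실", None, None),
    (None, None, "(VCB)", None, None),
    (None, None, "전기실", None, None),
]
groups = duplicate_groups(rows)
print(len(groups), groups)
```

Here `len(groups)` is 2 and each group has a count of 2, mirroring the "# duplicates" column.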