Overview

Dataset statistics

Number of variables: 13
Number of observations: 26
Missing cells: 25
Missing cells (%): 7.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.8 KiB
Average record size in memory: 110.1 B
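
The missing-cell percentage can be checked by hand from the figures in this report. The sketch below only reuses numbers printed in the report (13 variables, 26 observations, and the per-column missing counts listed under Alerts); the variable names are illustrative, and nothing is recomputed from the raw file.

```python
# Figures copied from the report; the raw file is not read here.
n_variables = 13
n_observations = 26

# Missing counts per column as listed under Alerts: 시 설 명 (2), the column
# with the blank name (3), the ten "Unsupported" columns (2 each), 단위 (0).
missing_per_column = [2, 3] + [2] * 10 + [0]

total_cells = n_variables * n_observations
missing_cells = sum(missing_per_column)
missing_pct = 100 * missing_cells / total_cells

print(total_cells, missing_cells, round(missing_pct, 1))
```

The per-column counts add up to the 25 missing cells reported above, out of 338 cells in total.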

Variable types

Text: 1
Categorical: 1
Numeric: 1
Unsupported: 10

Dataset

Description: File download
Author: 서울교통공사
URL: https://data.seoul.go.kr/dataList/OA-13293/F/1/datasetView.do

Alerts

단위 is highly imbalanced (51.5%) [Imbalance]
시 설 명 has 2 (7.7%) missing values [Missing]
(unnamed column) has 3 (11.5%) missing values [Missing]
1~4호선 has 2 (7.7%) missing values [Missing]
Unnamed: 4 has 2 (7.7%) missing values [Missing]
Unnamed: 5 has 2 (7.7%) missing values [Missing]
Unnamed: 6 has 2 (7.7%) missing values [Missing]
Unnamed: 7 has 2 (7.7%) missing values [Missing]
5~8호선 has 2 (7.7%) missing values [Missing]
Unnamed: 9 has 2 (7.7%) missing values [Missing]
Unnamed: 10 has 2 (7.7%) missing values [Missing]
Unnamed: 11 has 2 (7.7%) missing values [Missing]
Unnamed: 12 has 2 (7.7%) missing values [Missing]
1~4호선 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 4 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 5 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 6 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 7 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
5~8호선 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 9 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 10 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 11 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 12 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
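
The "Unnamed: N" names and the "Unsupported" types are consistent with a two-row header in the source file: the first data row of the sample holds sub-headers (소계, 1호선, ...), so those columns mix text and numbers. A minimal sketch of one possible cleanup, flattening the two header rows into single names, follows. The function name `flatten_header`, the `"unnamed"` fallback label, and the assumption that the blank column name is a single space are all illustrative and not taken from the report.

```python
def flatten_header(top, sub):
    """Combine a group-header row and a sub-header row into flat column names."""
    flat, group = [], None
    for top_name, sub_name in zip(top, sub):
        # Blank header cells show up as "Unnamed: N"; treat them as a
        # continuation of the most recent named group.
        if not top_name.startswith("Unnamed:"):
            group = top_name.strip() or "unnamed"  # fallback label is hypothetical
        if sub_name is None:
            flat.append(group)  # ordinary column without a sub-header
        else:
            flat.append(f"{group}_{sub_name}")
    return flat


# Column names as shown in this report (the blank name is assumed to be " ").
top = ["시 설 명", "단위", " ", "1~4호선", "Unnamed: 4", "Unnamed: 5", "Unnamed: 6",
       "Unnamed: 7", "5~8호선", "Unnamed: 9", "Unnamed: 10", "Unnamed: 11", "Unnamed: 12"]
# Row 0 of the sample, with <NA> written as None.
sub = [None, None, None, "소계", "1호선", "2호선", "3호선", "4호선",
       "소계", "5호선", "6호선", "7호선", "8호선"]

columns = flatten_header(top, sub)
print(columns)
```

After renaming, dropping the sub-header row and converting the line columns to numbers would presumably let the profiler treat them as numeric rather than unsupported.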

Reproduction

Analysis started: 2024-04-17 04:25:41.011888
Analysis finished: 2024-04-17 04:25:41.631495
Duration: 0.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
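
A report of this kind is typically produced along the following lines. This is a sketch only: the function name `build_report`, the default title, and the output path are hypothetical, and the exact options used for this report are those in the downloadable config.json, which is not reproduced here.

```python
def build_report(df, title="Dataset profile", output_path="report.html"):
    """Profile a pandas DataFrame and write the HTML report to disk."""
    # Imported lazily so the function can be defined without the package installed.
    from ydata_profiling import ProfileReport

    report = ProfileReport(df, title=title)
    report.to_file(output_path)
    return report
```

The function expects a DataFrame that has already been loaded; how the source file was read for this report is not stated.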

Variables

시 설 명
Text

MISSING 

Distinct: 24
Distinct (%): 100.0%
Missing: 2
Missing (%): 7.7%
Memory size: 340.0 B

Length

Max length: 10
Median length: 8
Mean length: 6.0833333
Min length: 3

Characters and Unicode

Total characters: 146
Distinct characters: 75
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
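
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for three values that appear in this column; it uses the two-letter category codes (Lo, Lu, Zs), which correspond to the long names used in this report (Other Letter, Uppercase Letter, Space Separator). It covers only these three values, not the whole column.

```python
import unicodedata
from collections import Counter

# Three values taken from the 시 설 명 column as shown in this report.
samples = ["승강장안전문", "승무원 출입문", "역사PSD감시"]

# Count the Unicode general category of every character.
category_counts = Counter(
    unicodedata.category(ch) for text in samples for ch in text
)

print(category_counts)
```

Hangul syllables fall under Lo (Other Letter), which is why that category dominates the column.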

Unique

Unique: 24
Unique (%): 100.0%

Sample

1st row: 승강장안전문
2nd row: 헤더박스
3rd row: 개별제어반
4th row: 가동문
5th row: 승무원 출입문
Value | Count | Frequency (%)
출입문 | 3 | 9.7%
승강장안전문 | 1 | 3.2%
hmi | 1 | 3.2%
정위치검지장치 | 1 | 3.2%
시스템 | 1 | 3.2%
역사psd감시 | 1 | 3.2%
유지보수전산기 | 1 | 3.2%
전원공급실 | 1 | 3.2%
전원장치 | 1 | 3.2%
검지센서 | 1 | 3.2%
Other values (19) | 19 | 61.3%

Most occurring characters

ValueCountFrequency (%)
8
 
5.5%
7
 
4.8%
6
 
4.1%
6
 
4.1%
6
 
4.1%
5
 
3.4%
5
 
3.4%
4
 
2.7%
4
 
2.7%
4
 
2.7%
Other values (65) 91
62.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 126 | 86.3%
Uppercase Letter | 11 | 7.5%
Space Separator | 7 | 4.8%
Open Punctuation | 1 | 0.7%
Close Punctuation | 1 | 0.7%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
8
 
6.3%
6
 
4.8%
6
 
4.8%
6
 
4.8%
5
 
4.0%
5
 
4.0%
4
 
3.2%
4
 
3.2%
4
 
3.2%
4
 
3.2%
Other values (52) 74
58.7%
Uppercase Letter
ValueCountFrequency (%)
D 2
18.2%
P 1
9.1%
S 1
9.1%
H 1
9.1%
M 1
9.1%
I 1
9.1%
R 1
9.1%
F 1
9.1%
L 1
9.1%
E 1
9.1%
Space Separator
ValueCountFrequency (%)
7
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 126 | 86.3%
Latin | 11 | 7.5%
Common | 9 | 6.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
8
 
6.3%
6
 
4.8%
6
 
4.8%
6
 
4.8%
5
 
4.0%
5
 
4.0%
4
 
3.2%
4
 
3.2%
4
 
3.2%
4
 
3.2%
Other values (52) 74
58.7%
Latin
ValueCountFrequency (%)
D 2
18.2%
P 1
9.1%
S 1
9.1%
H 1
9.1%
M 1
9.1%
I 1
9.1%
R 1
9.1%
F 1
9.1%
L 1
9.1%
E 1
9.1%
Common
ValueCountFrequency (%)
7
77.8%
( 1
 
11.1%
) 1
 
11.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 126 | 86.3%
ASCII | 20 | 13.7%

Most frequent character per block

Hangul
ValueCountFrequency (%)
8
 
6.3%
6
 
4.8%
6
 
4.8%
6
 
4.8%
5
 
4.0%
5
 
4.0%
4
 
3.2%
4
 
3.2%
4
 
3.2%
4
 
3.2%
Other values (52) 74
58.7%
ASCII
ValueCountFrequency (%)
7
35.0%
D 2
 
10.0%
P 1
 
5.0%
S 1
 
5.0%
( 1
 
5.0%
H 1
 
5.0%
M 1
 
5.0%
I 1
 
5.0%
) 1
 
5.0%
R 1
 
5.0%
Other values (3) 3
15.0%

단위
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 15.4%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 4
Median length: 1
Mean length: 1.3846154
Min length: 1

Unique

Unique: 2
Unique (%): 7.7%

Sample

1st row<NA>
2nd row
3rd row트랙
4th row
5th row

Common Values

ValueCountFrequency (%)
21
80.8%
<NA> 3
 
11.5%
1
 
3.8%
트랙 1
 
3.8%
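
The 51.5% imbalance figure for 단위 can be reproduced from the four counts above (21, 3, 1, 1) if the score is taken to be one minus the Shannon entropy of the value distribution, normalized by the maximum entropy for that number of classes. That definition is an assumption inferred from the numbers, not something this report states; the function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the entropy of the counts and k the number of classes."""
    n_classes = len(counts)
    if n_classes <= 1:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


# Counts of the four distinct 단위 values as listed under Common Values.
score = imbalance_score([21, 3, 1, 1])
print(f"{score:.1%}")
```

A score of 0 would mean all four values occur equally often; a score near 1 means one value dominates.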

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Plot of the common values of 단위]

(unnamed column)
Real number (ℝ)

MISSING 

Distinct: 19
Distinct (%): 82.6%
Missing: 3
Missing (%): 11.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 4873.4783
Minimum: 14
Maximum: 19752
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 366.0 B

Quantile statistics

Minimum: 14
5-th percentile: 163.2
Q1: 382.5
Median: 573
Q3: 2769.5
95-th percentile: 19746.4
Maximum: 19752
Range: 19738
Interquartile range (IQR): 2387

Descriptive statistics

Standard deviation: 8044.1637
Coefficient of variation (CV): 1.6506001
Kurtosis: 0.095694094
Mean: 4873.4783
Median Absolute Deviation (MAD): 295
Skewness: 1.4297263
Sum: 112090
Variance: 64708570
Monotonicity: Not monotonic
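
Several of these statistics can be cross-checked with Python's standard `statistics` module. The 23 non-missing values below are reconstructed from the frequency tables of this variable in this report (the ten smallest and ten largest values with their counts); that reconstruction is an assumption, although it matches the reported count, sum, and number of distinct values. The sample standard deviation and the "inclusive" quantile method are used because they reproduce the reported figures.

```python
import statistics

# Non-missing values, reconstructed from the frequency tables in this report.
values = sorted([
    14, 154, 246, 278, 292, 365, 400, 567, 569, 572, 572, 573, 575,
    819, 825, 1138, 1143, 4396, 19696, 19696, 19696, 19752, 19752,
])

mean = statistics.fmean(values)
std = statistics.stdev(values)  # sample standard deviation (n - 1 in the denominator)
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
mad = statistics.median(abs(v - median) for v in values)  # median absolute deviation
cv = std / mean  # coefficient of variation
iqr = q3 - q1

print(round(mean, 4), q1, median, q3, iqr, mad, round(cv, 4))
```

The large gap between the median (573) and the mean (about 4873) reflects the five values near 19,700 pulling the mean upward.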
Histogram with fixed size bins (bins=19)
Common values

Value | Count | Frequency (%)
19696 | 3 | 11.5%
19752 | 2 | 7.7%
572 | 2 | 7.7%
4396 | 1 | 3.8%
575 | 1 | 3.8%
154 | 1 | 3.8%
14 | 1 | 3.8%
365 | 1 | 3.8%
567 | 1 | 3.8%
573 | 1 | 3.8%
Other values (9) | 9 | 34.6%
(Missing) | 3 | 11.5%

Minimum 10 values

Value | Count | Frequency (%)
14 | 1 | 3.8%
154 | 1 | 3.8%
246 | 1 | 3.8%
278 | 1 | 3.8%
292 | 1 | 3.8%
365 | 1 | 3.8%
400 | 1 | 3.8%
567 | 1 | 3.8%
569 | 1 | 3.8%
572 | 2 | 7.7%

Maximum 10 values

Value | Count | Frequency (%)
19752 | 2 | 7.7%
19696 | 3 | 11.5%
4396 | 1 | 3.8%
1143 | 1 | 3.8%
1138 | 1 | 3.8%
825 | 1 | 3.8%
819 | 1 | 3.8%
575 | 1 | 3.8%
573 | 1 | 3.8%
572 | 2 | 7.7%

1~4호선
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 2
Missing (%): 7.7%
Memory size: 340.0 B

Unnamed: 4
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 2
Missing (%): 7.7%
Memory size: 340.0 B

Unnamed: 5
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 2
Missing (%): 7.7%
Memory size: 340.0 B

Unnamed: 6
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 2
Missing (%): 7.7%
Memory size: 340.0 B

Unnamed: 7
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 2
Missing (%): 7.7%
Memory size: 340.0 B

5~8호선
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 2
Missing (%): 7.7%
Memory size: 340.0 B

Unnamed: 9
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 2
Missing (%): 7.7%
Memory size: 340.0 B

Unnamed: 10
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 2
Missing (%): 7.7%
Memory size: 340.0 B

Unnamed: 11
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 2
Missing (%): 7.7%
Memory size: 340.0 B

Unnamed: 12
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 2
Missing (%): 7.7%
Memory size: 340.0 B

Interactions

[Figure: Interactions plot]

Correlations

 | 시 설 명 | 단위 | (unnamed)
시 설 명 | 1.000 | 1.000 | 1.000
단위 | 1.000 | 1.000 | 0.000
(unnamed) | 1.000 | 0.000 | 1.000

 | (unnamed) | 단위
(unnamed) | 1.000 | 0.000
단위 | 0.000 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.]
[Figure: The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.]

Sample

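First rows

Row | 시 설 명 | 단위 | (unnamed) | 1~4호선 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | 5~8호선 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12
0 | <NA> | <NA> | <NA> | 소계 | 1호선 | 2호선 | 3호선 | 4호선 | 소계 | 5호선 | 6호선 | 7호선 | 8호선
1 | 승강장안전문 |  | 278 | 120 | 10 | 50 | 34 | 26 | 158 | 51 | 39 | 51 | 17
2 | <NA> | 트랙 | 572 | 246 | 20 | 105 | 69 | 52 | 326 | 107 | 77 | 108 | 34
3 | 헤더박스 |  | 19696 | 9536 | 800 | 3896 | 2760 | 2080 | 10160 | 3424 | 2464 | 3456 | 816
4 | 개별제어반 |  | 19696 | 9536 | 800 | 3896 | 2760 | 2080 | 10160 | 3424 | 2464 | 3456 | 816
5 | 가동문 |  | 19696 | 9536 | 800 | 3896 | 2760 | 2080 | 10160 | 3424 | 2464 | 3456 | 816
6 | 승무원 출입문 |  | 400 | 67 | 0 | 61 | 5 | 1 | 333 | 108 | 78 | 113 | 34
7 | 선로 출입문 |  | 1143 | 491 | 40 | 209 | 138 | 104 | 652 | 214 | 154 | 216 | 68
8 | 선로출입문 제어장치 |  | 1138 | 486 | 40 | 205 | 137 | 104 | 652 | 214 | 154 | 216 | 68
9 | 종합제어반 |  | 292 | 124 | 10 | 53 | 35 | 26 | 168 | 54 | 42 | 55 | 17

Last rows

Row | 시 설 명 | 단위 | (unnamed) | 1~4호선 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | 5~8호선 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12
16 | 역명LED 표시창 |  | 4396 | 3932 | 300 | 1312 | 1340 | 980 | 464 | 160 | 16 | 288 | -
17 | 출입문 개폐등 |  | 19752 | 9592 | 800 | 3952 | 2760 | 2080 | 10160 | 3424 | 2464 | 3456 | 816
18 | 레이저거리 검지센서 |  | 573 | 246 | 20 | 105 | 69 | 52 | 327 | 107 | 77 | 109 | 34
19 | 전원장치 |  | 567 | 241 | 19 | 101 | 69 | 52 | 326 | 107 | 77 | 108 | 34
20 | 전원공급실 |  | 365 | 136 | 11 | 59 | 40 | 26 | 229 | 84 | 39 | 88 | 18
21 | 유지보수전산기 |  | 14 | 0 | 0 | 0 | 0 | 0 | 14 | 4 | 3 | 5 | 2
22 | 역사PSD감시 |  | 154 | 0 | 0 | 0 | 0 | 0 | 154 | 49 | 37 | 51 | 17
23 | 시스템 | <NA> | <NA> | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
24 | 정위치검지장치 |  | 575 | 248 | 20 | 107 | 69 | 52 | 327 | 107 | 77 | 109 | 34
25 | 구조체 |  | 19752 | 9592 | 800 | 3944 | 2768 | 2080 | 10160 | 3424 | 2464 | 3456 | 816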
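
The sample rows appear to follow a fixed pattern: each 소계 (subtotal) column equals the sum of the four line columns after it, and the column with the blank name equals the sum of the two subtotals. The sketch below checks that reading on three rows from the sample. The per-cell values are an interpretation of the fused digits shown in the sample, and the function name `is_consistent` is illustrative.

```python
def is_consistent(total, group_1_4, group_5_8):
    """Check that each subtotal matches its lines and that the total matches both subtotals."""
    sub_a, *lines_a = group_1_4
    sub_b, *lines_b = group_5_8
    return (
        sub_a == sum(lines_a)
        and sub_b == sum(lines_b)
        and total == sub_a + sub_b
    )


# (total, [소계, 1호선..4호선], [소계, 5호선..8호선]) for three sample rows.
rows = {
    "승강장안전문": (278, [120, 10, 50, 34, 26], [158, 51, 39, 51, 17]),
    "헤더박스": (19696, [9536, 800, 3896, 2760, 2080], [10160, 3424, 2464, 3456, 816]),
    "유지보수전산기": (14, [0, 0, 0, 0, 0], [14, 4, 3, 5, 2]),
}

results = {name: is_consistent(*row) for name, row in rows.items()}
print(results)
```

A check like this is also a convenient way to validate the numeric columns once the two-row header has been cleaned up.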