Overview

Dataset statistics

Number of variables: 7
Number of observations: 515
Missing cells: 515
Missing cells (%): 14.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 29.3 KiB
Average record size in memory: 58.3 B
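
The percentage and per-record figures above follow from the raw counts. The sketch below recomputes them, assuming that the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB = 1024 B. The variable names are illustrative, not part of the report.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 7
n_observations = 515
missing_cells = 515
total_size_kib = 29.3

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells (%): {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values the report shows (14.3% and 58.3 B).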

Variable types

Categorical: 2
Text: 3
Numeric: 1
DateTime: 1

Dataset

Description: File download
Author: 서울 교통공사 (Seoul Metro)
URL: https://data.seoul.go.kr/dataList/OA-13242/F/1/datasetView.do

Alerts

비 고 has constant value "" [Constant]
비 고 has 513 (99.6%) missing values [Missing]
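
Both alerts can be traced back to the counts reported for 비 고 further down (513 missing, 2 rows holding 휴지중, Distinct 1, Distinct (%) 50.0%). The sketch below rebuilds such a column and recomputes those figures. It assumes that the distinct percentage is taken over non-missing rows only, which is consistent with 1 / 2 = 50.0%. It illustrates the arithmetic and is not the profiler's own implementation.

```python
# Rebuild the 비 고 column as the report describes it:
# 515 rows, 513 of them missing, and "휴지중" in the remaining 2.
values = [None] * 513 + ["휴지중"] * 2

non_missing = [v for v in values if v is not None]
n_missing = len(values) - len(non_missing)
n_distinct = len(set(non_missing))

missing_pct = 100 * n_missing / len(values)
# Assumption: distinct (%) is computed over non-missing rows only.
distinct_pct = 100 * n_distinct / len(non_missing)
# A column with a single distinct non-missing value is constant.
is_constant = n_distinct == 1

print(f"missing: {n_missing} ({missing_pct:.1f}%)")
print(f"distinct: {n_distinct} ({distinct_pct:.1f}%)")
print(f"constant: {is_constant}")
```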

Reproduction

Analysis started: 2023-12-11 04:43:01.619220
Analysis finished: 2023-12-11 04:43:02.767591
Duration: 1.15 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
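
A report of this kind is typically produced with a few lines of ydata-profiling. The following is a minimal sketch, not the exact script behind this report. The function name, the CSV input, the cp949 encoding and the report title are all assumptions, and reproducing the variable types shown here (for example 호선 as Categorical and 설치일자 as DateTime) would probably need extra dtype handling that is not shown.

```python
def build_report(csv_path, html_path, encoding="cp949"):
    """Profile a CSV export of the dataset and write an HTML report.

    Hypothetical helper: the paths and the encoding are assumptions.
    """
    # Imported lazily so the function can be defined even when the
    # third-party packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(
        df,
        title="Profiling report",  # assumed title
        dataset={
            "description": "파일 다운로드",
            "url": "https://data.seoul.go.kr/dataList/OA-13242/F/1/datasetView.do",
        },
    )
    report.to_file(html_path)
    return report
```

A call such as build_report("elevators.csv", "report.html") would write the HTML file. Both file names are placeholders.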

Variables

호선 (line)
Categorical

Distinct: 4
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.2 KiB

Value | Count
2 | 203
3 | 180
4 | 103
1 | 29

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
2 | 203 | 39.4%
3 | 180 | 35.0%
4 | 103 | 20.0%
1 | 29 | 5.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
2 | 203 | 39.4%
3 | 180 | 35.0%
4 | 103 | 20.0%
1 | 29 | 5.6%

역사명 (station name)
Text

Distinct: 97
Distinct (%): 18.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.2 KiB

Length

Max length: 12
Median length: 10
Mean length: 3.4524272
Min length: 2

Characters and Unicode

Total characters: 1778
Distinct characters: 133
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
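
As an illustration of the character properties mentioned above, Python's standard unicodedata module returns the general category of each code point. The sketch below counts categories over two of the sample values. The helper name category_counts is illustrative. The two-letter codes map to the labels used in this report: Lo is Other Letter, Nd is Decimal Number, Ps is Open Punctuation and Pe is Close Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Two sample values taken from the 역사명 column.
samples = ["서울역(1)", "시청"]
counts = category_counts(samples)

for code, n in counts.most_common():
    print(code, n)
```

Hangul syllables fall under Lo, which is why Other Letter dominates the category tables for this variable.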

Unique

Unique: 2
Unique (%): 0.4%

Sample

1st row: 서울역(1)
2nd row: 서울역(1)
3rd row: 서울역(1)
4th row: 시청
5th row: 시청

Value | Count | Frequency (%)
고속터미널 | 24 | 4.7%
가락시장 | 22 | 4.3%
충무로 | 22 | 4.3%
경찰병원 | 20 | 3.9%
오금 | 15 | 2.9%
동묘앞 | 12 | 2.3%
용두 | 12 | 2.3%
역삼 | 12 | 2.3%
사당(2 | 12 | 2.3%
신천 | 11 | 2.1%
Other values (87) | 353 | 68.5%

Most occurring characters

Value | Count | Frequency (%)
[…] | 86 | 4.8%
( | 73 | 4.1%
) | 73 | 4.1%
[…] | 54 | 3.0%
[…] | 50 | 2.8%
[…] | 48 | 2.7%
[…] | 48 | 2.7%
[…] | 40 | 2.2%
[…] | 37 | 2.1%
[…] | 35 | 2.0%
Other values (123) | 1234 | 69.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1535 | 86.3%
Decimal Number | 91 | 5.1%
Open Punctuation | 73 | 4.1%
Close Punctuation | 73 | 4.1%
Uppercase Letter | 4 | 0.2%
Other Punctuation | 2 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[…] | 86 | 5.6%
[…] | 54 | 3.5%
[…] | 50 | 3.3%
[…] | 48 | 3.1%
[…] | 48 | 3.1%
[…] | 40 | 2.6%
[…] | 37 | 2.4%
[…] | 35 | 2.3%
[…] | 32 | 2.1%
[…] | 31 | 2.0%
Other values (114) | 1074 | 70.0%

Decimal Number
Value | Count | Frequency (%)
4 | 31 | 34.1%
3 | 26 | 28.6%
2 | 22 | 24.2%
1 | 12 | 13.2%

Uppercase Letter
Value | Count | Frequency (%)
W | 2 | 50.0%
M | 2 | 50.0%

Open Punctuation
Value | Count | Frequency (%)
( | 73 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 73 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1535 | 86.3%
Common | 239 | 13.4%
Latin | 4 | 0.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[…] | 86 | 5.6%
[…] | 54 | 3.5%
[…] | 50 | 3.3%
[…] | 48 | 3.1%
[…] | 48 | 3.1%
[…] | 40 | 2.6%
[…] | 37 | 2.4%
[…] | 35 | 2.3%
[…] | 32 | 2.1%
[…] | 31 | 2.0%
Other values (114) | 1074 | 70.0%

Common
Value | Count | Frequency (%)
( | 73 | 30.5%
) | 73 | 30.5%
4 | 31 | 13.0%
3 | 26 | 10.9%
2 | 22 | 9.2%
1 | 12 | 5.0%
/ | 2 | 0.8%

Latin
Value | Count | Frequency (%)
W | 2 | 50.0%
M | 2 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1535 | 86.3%
ASCII | 243 | 13.7%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[…] | 86 | 5.6%
[…] | 54 | 3.5%
[…] | 50 | 3.3%
[…] | 48 | 3.1%
[…] | 48 | 3.1%
[…] | 40 | 2.6%
[…] | 37 | 2.4%
[…] | 35 | 2.3%
[…] | 32 | 2.1%
[…] | 31 | 2.0%
Other values (114) | 1074 | 70.0%

ASCII
Value | Count | Frequency (%)
( | 73 | 30.0%
) | 73 | 30.0%
4 | 31 | 12.8%
3 | 26 | 10.7%
2 | 22 | 9.1%
1 | 12 | 4.9%
W | 2 | 0.8%
/ | 2 | 0.8%
M | 2 | 0.8%

호기 (unit number)
Real number (ℝ)

Distinct: 24
Distinct (%): 4.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.1145631
Minimum: 1
Maximum: 24
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.7 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 4
Q3: 7
95th percentile: 16
Maximum: 24
Range: 23
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 4.6970975
Coefficient of variation (CV): 0.91837708
Kurtosis: 2.7921785
Mean: 5.1145631
Median Absolute Deviation (MAD): 2
Skewness: 1.7287941
Sum: 2634
Variance: 22.062725
Monotonicity: Not monotonic
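
Several of these statistics are derived from one another, which gives a quick consistency check on the reported values. The sketch below recomputes the mean, variance, coefficient of variation, range and IQR from the sum, the standard deviation, the quartiles and the extremes. It assumes the usual definitions (mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean) and uses the 515 observations with no missing values reported for 호기.

```python
# Values copied from the 호기 statistics.
n = 515
total = 2634
std = 4.6970975
q1, q3 = 2, 7
minimum, maximum = 1, 24

mean = total / n
variance = std ** 2
cv = std / mean          # coefficient of variation
iqr = q3 - q1            # interquartile range
value_range = maximum - minimum

print(f"mean: {mean:.7f}")
print(f"variance: {variance:.6f}")
print(f"CV: {cv:.8f}")
print(f"IQR: {iqr}, range: {value_range}")
```

The recomputed figures agree with the report to the precision shown.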
[Figure: histogram with fixed-size bins (bins=24)]

Common values

Value | Count | Frequency (%)
1 | 97 | 18.8%
2 | 95 | 18.4%
3 | 64 | 12.4%
4 | 57 | 11.1%
5 | 35 | 6.8%
6 | 35 | 6.8%
7 | 22 | 4.3%
8 | 21 | 4.1%
9 | 14 | 2.7%
10 | 13 | 2.5%
Other values (14) | 62 | 12.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 97 | 18.8%
2 | 95 | 18.4%
3 | 64 | 12.4%
4 | 57 | 11.1%
5 | 35 | 6.8%
6 | 35 | 6.8%
7 | 22 | 4.3%
8 | 21 | 4.1%
9 | 14 | 2.7%
10 | 13 | 2.5%

Maximum 10 values

Value | Count | Frequency (%)
24 | 1 | 0.2%
23 | 1 | 0.2%
22 | 3 | 0.6%
21 | 3 | 0.6%
20 | 4 | 0.8%
19 | 4 | 0.8%
18 | 4 | 0.8%
17 | 4 | 0.8%
16 | 4 | 0.8%
15 | 5 | 1.0%

설치위치 (installation location)
Categorical

Distinct: 41
Distinct (%): 8.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.2 KiB

Value | Count
내부 | 199
1번출구 | 52
2번출구 | 48
3번출구 | 44
5번출구 | 35
Other values (36) | 137

Length

Max length: 13
Median length: 10
Mean length: 3.5087379
Min length: 2

Unique

Unique: 8
Unique (%): 1.6%

Sample

1st row: 4호선연결통로
2nd row: 내부
3rd row: 내부
4th row: 5번출구
5th row: 5번출구

Common Values

Value | Count | Frequency (%)
내부 | 199 | 38.6%
1번출구 | 52 | 10.1%
2번출구 | 48 | 9.3%
3번출구 | 44 | 8.5%
5번출구 | 35 | 6.8%
4번출구 | 33 | 6.4%
6번출구 | 18 | 3.5%
8번출구 | 15 | 2.9%
14번출구 | 6 | 1.2%
9번출구 | 6 | 1.2%
Other values (31) | 59 | 11.5%

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
내부 | 201 | 37.8%
1번출구 | 52 | 9.8%
2번출구 | 48 | 9.0%
3번출구 | 44 | 8.3%
5번출구 | 35 | 6.6%
4번출구 | 33 | 6.2%
6번출구 | 20 | 3.8%
8번출구 | 15 | 2.8%
14번출구 | 6 | 1.1%
9번출구 | 6 | 1.1%
Other values (33) | 72 | 13.5%

승강기번호 (elevator number)
Text

Distinct: 512
Distinct (%): 99.8%
Missing: 2
Missing (%): 0.4%
Memory size: 4.2 KiB

Length

Max length: 8
Median length: 8
Mean length: 7.9980507
Min length: 7

Characters and Unicode

Total characters: 4103
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
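
The length and character totals above pin down how many values deviate from the usual 8-character form. The sketch below works this out from the reported figures, using the 515 observations and 2 missing values given for this variable. It also checks the sample values against a NNNN-NNN pattern. That regular expression is an assumption inferred from the samples, not something the report states.

```python
import re

# Figures copied from the report for this variable.
n_observations = 515
n_missing = 2
total_characters = 4103
max_length, min_length = 8, 7

non_missing = n_observations - n_missing

# Lengths are integers between 7 and 8, so every value is either
# 8 or 7 characters long; each 7-character value removes one
# character from the all-8 total.
n_short = max_length * non_missing - total_characters
n_full = non_missing - n_short
mean_length = total_characters / non_missing

# Assumed pattern: four digits, a dash, three digits.
pattern = re.compile(r"\d{4}-\d{3}")
samples = ["1800-448", "1806-600", "1806-599", "1808-119", "1808-120"]
all_match = all(pattern.fullmatch(s) for s in samples)

print(f"8-character values: {n_full}, 7-character values: {n_short}")
print(f"mean length: {mean_length:.7f}")
print(f"samples match NNNN-NNN: {all_match}")
```

Under these assumptions exactly one non-missing value is 7 characters long, so it is a candidate for a data-entry check.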

Unique

Unique: 511
Unique (%): 99.6%

Sample

1st row: 1800-448
2nd row: 1806-600
3rd row: 1806-599
4th row: 1808-119
5th row: 1808-120

Value | Count | Frequency (%)
1803-789 | 2 | 0.4%
1803-452 | 1 | 0.2%
1800-448 | 1 | 0.2%
1802-117 | 1 | 0.2%
1802-056 | 1 | 0.2%
1802-055 | 1 | 0.2%
1802-120 | 1 | 0.2%
1802-119 | 1 | 0.2%
1811-190 | 1 | 0.2%
1811-189 | 1 | 0.2%
Other values (502) | 502 | 97.9%

Most occurring characters

Value | Count | Frequency (%)
1 | 777 | 18.9%
8 | 751 | 18.3%
0 | 748 | 18.2%
- | 513 | 12.5%
9 | 249 | 6.1%
3 | 224 | 5.5%
2 | 197 | 4.8%
7 | 179 | 4.4%
5 | 169 | 4.1%
4 | 157 | 3.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 3590 | 87.5%
Dash Punctuation | 513 | 12.5%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 777 | 21.6%
8 | 751 | 20.9%
0 | 748 | 20.8%
9 | 249 | 6.9%
3 | 224 | 6.2%
2 | 197 | 5.5%
7 | 179 | 5.0%
5 | 169 | 4.7%
4 | 157 | 4.4%
6 | 139 | 3.9%

Dash Punctuation
Value | Count | Frequency (%)
- | 513 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 4103 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 777 | 18.9%
8 | 751 | 18.3%
0 | 748 | 18.2%
- | 513 | 12.5%
9 | 249 | 6.1%
3 | 224 | 5.5%
2 | 197 | 4.8%
7 | 179 | 4.4%
5 | 169 | 4.1%
4 | 157 | 3.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4103 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 777 | 18.9%
8 | 751 | 18.3%
0 | 748 | 18.2%
- | 513 | 12.5%
9 | 249 | 6.1%
3 | 224 | 5.5%
2 | 197 | 4.8%
7 | 179 | 4.4%
5 | 169 | 4.1%
4 | 157 | 3.8%

설치일자 (installation date)
DateTime

Distinct: 167
Distinct (%): 32.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.2 KiB
Minimum: 1984-04-27 00:00:00
Maximum: 2015-12-15 00:00:00
[Figure: histogram with fixed-size bins (bins=50)]

비 고 (remarks)
Text

Alerts: CONSTANT, MISSING

Distinct: 1
Distinct (%): 50.0%
Missing: 513
Missing (%): 99.6%
Memory size: 4.2 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 6
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 휴지중
2nd row: 휴지중

Value | Count | Frequency (%)
휴지중 | 2 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
휴 | 2 | 33.3%
지 | 2 | 33.3%
중 | 2 | 33.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 6 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
휴 | 2 | 33.3%
지 | 2 | 33.3%
중 | 2 | 33.3%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 6 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
휴 | 2 | 33.3%
지 | 2 | 33.3%
중 | 2 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 6 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
휴 | 2 | 33.3%
지 | 2 | 33.3%
중 | 2 | 33.3%

Interactions

[Figure: interaction plot]

Correlations

[Figure: correlation heatmap]

Variable | 호선 | 역사명 | 호기 | 설치위치
호선 | 1.000 | 1.000 | 0.221 | 0.493
역사명 | 1.000 | 1.000 | 0.000 | 0.956
호기 | 0.221 | 0.000 | 1.000 | 0.346
설치위치 | 0.493 | 0.956 | 0.346 | 1.000

[Figure: correlation heatmap]

Variable | 호선 | 설치위치
호선 | 1.000 | 0.261
설치위치 | 0.261 | 1.000

[Figure: correlation heatmap]

Variable | 호기 | 호선 | 설치위치
호기 | 1.000 | 0.133 | 0.121
호선 | 0.133 | 1.000 | 0.261
설치위치 | 0.121 | 0.261 | 1.000

Missing values

[Figure: nullity count] A simple visualization of nullity by column.
[Figure: nullity matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure: nullity heatmap] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

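First rows

Index | 호선 | 역사명 | 호기 | 설치위치 | 승강기번호 | 설치일자 | 비 고
0 | 1 | 서울역(1) | 1 | 4호선연결통로 | 1800-448 | 2000-05-06 | <NA>
1 | 1 | 서울역(1) | 2 | 내부 | 1806-600 | 2014-04-28 | <NA>
2 | 1 | 서울역(1) | 3 | 내부 | 1806-599 | 2014-07-04 | <NA>
3 | 1 | 시청 | 1 | 5번출구 | 1808-119 | 2011-11-04 | <NA>
4 | 1 | 시청 | 2 | 5번출구 | 1808-120 | 2011-11-04 | <NA>
5 | 1 | 시청 | 3 | 1번출구 | 0809-239 | 2014-06-24 | <NA>
6 | 1 | 종로3가(1) | 1 | 1번출구 | 1806-432 | 2013-11-25 | <NA>
7 | 1 | 종로3가(1) | 2 | 1번출구 | 1806-431 | 2013-11-25 | <NA>
8 | 1 | 종로3가(1) | 3 | 14번출구 | 1806-597 | 2014-07-16 | <NA>
9 | 1 | 종로3가(1) | 4 | 14번출구 | 1806-598 | 2014-07-16 | <NA>

Last rows

Index | 호선 | 역사명 | 호기 | 설치위치 | 승강기번호 | 설치일자 | 비 고
505 | 4 | 총신대입구 | 2 | 14번출구 | 1807-714 | 2014-07-18 | <NA>
506 | 4 | 사당(4) | 1 | 내부 | 1804-715 | 2007-07-25 | <NA>
507 | 4 | 사당(4) | 2 | 4번출구 | 1803-282 | 2004-04-24 | <NA>
508 | 4 | 사당(4) | 3 | 4번출구 | 1808-281 | 2004-04-24 | <NA>
509 | 4 | 사당(4) | 4 | 9번출구 | 1804-714 | 2004-07-30 | <NA>
510 | 4 | 사당(4) | 5 | 9번출구 | 1804-713 | 2004-07-30 | <NA>
511 | 4 | 사당(4) | 6 | 3번출구 | 1807-301 | 2012-09-24 | <NA>
512 | 4 | 사당(4) | 7 | 3번출구 | 1808-283 | 2009-02-16 | <NA>
513 | 4 | 남태령 | 1 | 내부 | 1802-114 | 2015-11-23 | <NA>
514 | 4 | 남태령 | 2 | 내부 | 1802-113 | 2015-11-23 | <NA>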