Overview

Dataset statistics

Number of variables: 8
Number of observations: 296
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 20.1 KiB
Average record size in memory: 69.4 B

Variable types

Numeric: 5
Text: 1
Categorical: 2

Dataset

Description: File download (파일 다운로드)
Author: Seoul Metro (서울교통공사)
URL: https://data.seoul.go.kr/dataList/OA-11572/S/1/datasetView.do

Alerts

연번 is highly overall correlated with 호선 and 2 other fields (High correlation)
호선 is highly overall correlated with 연번 and 2 other fields (High correlation)
길이(M) is highly overall correlated with 연번 and 3 other fields (High correlation)
준공연도 is highly overall correlated with 연번 and 2 other fields (High correlation)
층수 is highly overall correlated with 길이(M) (High correlation)
연번 has unique values (Unique)

Reproduction

Analysis started: 2024-04-29 15:51:59.687658
Analysis finished: 2024-04-29 15:52:04.050767
Duration: 4.36 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
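
The reported duration can be checked directly from the two timestamps. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of the report.
started = datetime.fromisoformat("2024-04-29 15:51:59.687658")
finished = datetime.fromisoformat("2024-04-29 15:52:04.050767")

# Elapsed wall-clock time between the start and end of the analysis.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```

Rounded to two decimals, the difference agrees with the Duration line above.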

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 296
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 148.5
Minimum: 1
Maximum: 296
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 15.75
Q1: 74.75
Median: 148.5
Q3: 222.25
95-th percentile: 281.25
Maximum: 296
Range: 295
Interquartile range (IQR): 147.5

Descriptive statistics

Standard deviation: 85.592056
Coefficient of variation (CV): 0.57637748
Kurtosis: -1.2
Mean: 148.5
Median Absolute Deviation (MAD): 74
Skewness: 0
Sum: 43956
Variance: 7326
Monotonicity: Strictly increasing
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
1                       1  0.3%
205                     1  0.3%
203                     1  0.3%
202                     1  0.3%
201                     1  0.3%
200                     1  0.3%
199                     1  0.3%
198                     1  0.3%
197                     1  0.3%
196                     1  0.3%
Other values (286)    286  96.6%

Minimum values

Value               Count  Frequency (%)
1                       1  0.3%
2                       1  0.3%
3                       1  0.3%
4                       1  0.3%
5                       1  0.3%
6                       1  0.3%
7                       1  0.3%
8                       1  0.3%
9                       1  0.3%
10                      1  0.3%

Maximum values

Value               Count  Frequency (%)
296                     1  0.3%
295                     1  0.3%
294                     1  0.3%
293                     1  0.3%
292                     1  0.3%
291                     1  0.3%
290                     1  0.3%
289                     1  0.3%
288                     1  0.3%
287                     1  0.3%
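
The statistics reported for 연번 are what one would expect from a plain running index. As a sanity check, the sketch below recomputes them with the Python standard library, assuming the column holds exactly the integers 1 to 296 (consistent with the minimum, maximum, distinct count, and strictly increasing monotonicity above). The `percentile` helper is a hypothetical name and uses linear interpolation between closest ranks, which is an assumption about how the report computes quantiles.

```python
import math
import statistics


def percentile(sorted_values, fraction):
    """Percentile by linear interpolation between the two closest ranks."""
    position = (len(sorted_values) - 1) * fraction
    lower = math.floor(position)
    upper = math.ceil(position)
    weight = position - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


# Assumption: 연번 is the running index 1..296.
values = list(range(1, 297))

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(values)
median = statistics.median(values)
# Median absolute deviation: median of |x - median|.
mad = statistics.median(abs(v - median) for v in values)
cv = std_dev / mean

q1 = percentile(values, 0.25)
q3 = percentile(values, 0.75)

print(mean, variance, round(std_dev, 6), mad, round(cv, 8))
print(percentile(values, 0.05), q1, q3, percentile(values, 0.95), q3 - q1)
```

Under that assumption the recomputed mean, variance, MAD, and quartiles match the figures listed above.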

호선 (line number)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.8648649
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 3
Median: 5
Q3: 7
95-th percentile: 8
Maximum: 9
Range: 8
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.1554775
Coefficient of variation (CV): 0.44307038
Kurtosis: -1.0125135
Mean: 4.8648649
Median Absolute Deviation (MAD): 2
Skewness: 0.0021732148
Sum: 1440
Variance: 4.6460834
Monotonicity: Increasing
[Figure: Histogram with fixed size bins (bins=9)]

Common values

Value               Count  Frequency (%)
5                      56  18.9%
7                      51  17.2%
2                      50  16.9%
6                      39  13.2%
3                      34  11.5%
4                      26  8.8%
8                      17  5.7%
9                      13  4.4%
1                      10  3.4%

Minimum values

Value               Count  Frequency (%)
1                      10  3.4%
2                      50  16.9%
3                      34  11.5%
4                      26  8.8%
5                      56  18.9%
6                      39  13.2%
7                      51  17.2%
8                      17  5.7%
9                      13  4.4%

Maximum values

Value               Count  Frequency (%)
9                      13  4.4%
8                      17  5.7%
7                      51  17.2%
6                      39  13.2%
5                      56  18.9%
4                      26  8.8%
3                      34  11.5%
2                      50  16.9%
1                      10  3.4%
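
Because 호선 has only nine distinct values, its summary statistics can be recomputed from the frequency table alone. The sketch below does this with a frequency-weighted mean and sample variance; the `counts` dictionary is transcribed from the table above, and the other names are illustrative.

```python
import math

# Value -> count, transcribed from the 호선 frequency table.
counts = {1: 10, 2: 50, 3: 34, 4: 26, 5: 56, 6: 39, 7: 51, 8: 17, 9: 13}

n = sum(counts.values())                            # number of observations
total = sum(value * c for value, c in counts.items())
mean = total / n

# Sample variance (n - 1 denominator), weighting each value by its count.
variance = sum(c * (value - mean) ** 2 for value, c in counts.items()) / (n - 1)
std_dev = math.sqrt(variance)

# Share of each value as a percentage, rounded to one decimal place.
share = {value: round(100 * c / n, 1) for value, c in counts.items()}

print(n, total, round(mean, 7), round(variance, 7), round(std_dev, 7))
print(share)
```

The counts add up to the 296 observations, and the weighted sum, mean, and variance agree with the descriptive statistics listed for this variable.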

역명 (station name)
Text

Distinct: 258
Distinct (%): 87.2%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 9
Median length: 2
Mean length: 2.9695946
Min length: 2

Characters and Unicode

Total characters: 879
Distinct characters: 219
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 222
Unique (%): 75.0%
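
The report distinguishes "Distinct" (how many different values occur) from "Unique" (how many values occur exactly once). The sketch below illustrates the difference on a small hypothetical list; the station names are borrowed from the sample rows, but the list itself is made up for illustration and is not the dataset.

```python
from collections import Counter

# Hypothetical toy column, not the real 역명 data.
names = ["서울", "시청", "종각", "종로3가", "서울", "종로3가", "종로3가"]

frequency = Counter(names)

# Distinct: number of different values present.
distinct = len(frequency)
# Unique: number of values that appear exactly once.
unique = sum(1 for count in frequency.values() if count == 1)

print(distinct, unique)
print(f"{100 * distinct / len(names):.1f}% distinct, "
      f"{100 * unique / len(names):.1f}% unique")
```

Read the same way, the figures above mean that 258 - 222 = 36 station names appear on more than one row, which fits stations served by several lines.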

Sample

1st row: 서울
2nd row: 시청
3rd row: 종각
4th row: 종로3가
5th row: 종로5가

Value               Count  Frequency (%)
종로3가                 3  1.0%
동대문역사문화공원      3  1.0%
석촌                    2  0.7%
서울                    2  0.7%
충무로                  2  0.7%
충정로                  2  0.7%
시청                    2  0.7%
교대                    2  0.7%
사당                    2  0.7%
공덕                    2  0.7%
Other values (248)    274  92.6%

Most occurring characters

Value               Count  Frequency (%)
(missing)              32  3.6%
(missing)              29  3.3%
(missing)              26  3.0%
(missing)              25  2.8%
(missing)              20  2.3%
(missing)              16  1.8%
(missing)              16  1.8%
(missing)              15  1.7%
(missing)              15  1.7%
(missing)              15  1.7%
Other values (209)    670  76.2%

Most occurring categories

Value               Count  Frequency (%)
Other Letter          871  99.1%
Decimal Number          8  0.9%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
(missing)              32  3.7%
(missing)              29  3.3%
(missing)              26  3.0%
(missing)              25  2.9%
(missing)              20  2.3%
(missing)              16  1.8%
(missing)              16  1.8%
(missing)              15  1.7%
(missing)              15  1.7%
(missing)              15  1.7%
Other values (206)    662  76.0%

Decimal Number

Value               Count  Frequency (%)
3                       5  62.5%
4                       2  25.0%
5                       1  12.5%

Most occurring scripts

Value               Count  Frequency (%)
Hangul                871  99.1%
Common                  8  0.9%

Most frequent character per script

Hangul

Value               Count  Frequency (%)
(missing)              32  3.7%
(missing)              29  3.3%
(missing)              26  3.0%
(missing)              25  2.9%
(missing)              20  2.3%
(missing)              16  1.8%
(missing)              16  1.8%
(missing)              15  1.7%
(missing)              15  1.7%
(missing)              15  1.7%
Other values (206)    662  76.0%

Common

Value               Count  Frequency (%)
3                       5  62.5%
4                       2  25.0%
5                       1  12.5%

Most occurring blocks

Value               Count  Frequency (%)
Hangul                871  99.1%
ASCII                   8  0.9%

Most frequent character per block

Hangul

Value               Count  Frequency (%)
(missing)              32  3.7%
(missing)              29  3.3%
(missing)              26  3.0%
(missing)              25  2.9%
(missing)              20  2.3%
(missing)              16  1.8%
(missing)              16  1.8%
(missing)              15  1.7%
(missing)              15  1.7%
(missing)              15  1.7%
Other values (206)    662  76.0%

ASCII

Value               Count  Frequency (%)
3                       5  62.5%
4                       2  25.0%
5                       1  12.5%

형식 (type)
Categorical

Distinct: 3
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB
Category counts: 상대식 204, 섬식 78, 복합식 14

Length

Max length: 3
Median length: 3
Mean length: 2.7364865
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 섬식
2nd row: 상대식
3rd row: 상대식
4th row: 상대식
5th row: 상대식

Common Values

Value               Count  Frequency (%)
상대식                204  68.9%
섬식                   78  26.4%
복합식                 14  4.7%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value               Count  Frequency (%)
상대식                204  68.9%
섬식                   78  26.4%
복합식                 14  4.7%

길이(M) (length, m)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 177.78716
Minimum: 90
Maximum: 210
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 90
5-th percentile: 125
Q1: 165
Median: 165
Q3: 205
95-th percentile: 205
Maximum: 210
Range: 120
Interquartile range (IQR): 40

Descriptive statistics

Standard deviation: 24.253296
Coefficient of variation (CV): 0.13641759
Kurtosis: -0.29170714
Mean: 177.78716
Median Absolute Deviation (MAD): 0
Skewness: -0.31194373
Sum: 52625
Variance: 588.22234
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=6)]

Common values

Value               Count  Frequency (%)
165                   162  54.7%
205                   104  35.1%
125                    17  5.7%
210                    10  3.4%
130                     2  0.7%
90                      1  0.3%

Minimum values

Value               Count  Frequency (%)
90                      1  0.3%
125                    17  5.7%
130                     2  0.7%
165                   162  54.7%
205                   104  35.1%
210                    10  3.4%

Maximum values

Value               Count  Frequency (%)
210                    10  3.4%
205                   104  35.1%
165                   162  54.7%
130                     2  0.7%
125                    17  5.7%
90                      1  0.3%
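
A Median Absolute Deviation of 0 can look like an error, but it follows from the frequency table: more than half of the rows share the median value, so the median of the absolute deviations is itself 0. The sketch below expands the table back into individual values to show this; the `counts` dictionary is transcribed from the table above and the remaining names are illustrative.

```python
import statistics

# Value -> count, transcribed from the 길이(M) frequency table.
counts = {90: 1, 125: 17, 130: 2, 165: 162, 205: 104, 210: 10}

# Expand the table back into a sorted list of 296 individual values.
values = sorted(value for value, c in counts.items() for _ in range(c))

median = statistics.median(values)
# MAD: median of the absolute deviations from the median.
mad = statistics.median(abs(v - median) for v in values)

# Fraction of rows that sit exactly on the median.
share_at_median = counts[int(median)] / len(values)

print(median, mad, round(share_at_median, 3), sum(values))
```

Since about 54.7% of the rows equal 165, at least half of the absolute deviations are 0, which gives a MAD of 0 even though the standard deviation is well above zero.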

층수 (number of floors)
Categorical

HIGH CORRELATION 

Distinct: 18
Distinct (%): 6.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB
Category counts: B2 121, B3 85, B4 37, B5 17, 3F 14, Other values (13) 22

Length

Max length: 4
Median length: 2
Mean length: 2.0540541
Min length: 2

Unique

Unique: 10
Unique (%): 3.4%

Sample

1st row: B2
2nd row: B2
3rd row: B2
4th row: B2
5th row: B2

Common Values

Value               Count  Frequency (%)
B2                    121  40.9%
B3                     85  28.7%
B4                     37  12.5%
B5                     17  5.7%
3F                     14  4.7%
2F                      7  2.4%
B6                      3  1.0%
1FB3                    2  0.7%
5FB2                    1  0.3%
1F                      1  0.3%
Other values (8)        8  2.7%
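
The 층수 codes mix two patterns, such as `B2`, `3F`, and `5FB2`, which is why the report treats the column as categorical. If a numeric view is needed, one option is to split each code into two integers. The sketch below assumes that `nF` refers to above-ground floors and `Bn` to basement levels; that reading is an assumption, not something the report states, and `parse_floor_code` is a hypothetical helper.

```python
import re

# Optional "<digits>F" part followed by an optional "B<digits>" part.
FLOOR_CODE = re.compile(r"^(?:(\d+)F)?(?:B(\d+))?$")


def parse_floor_code(code):
    """Split a code like '5FB2' into (above_ground, below_ground) integers."""
    cleaned = code.strip().upper()
    match = FLOOR_CODE.match(cleaned)
    if not cleaned or match is None:
        raise ValueError(f"unrecognised floor code: {code!r}")
    above, below = match.groups()
    # A missing part is treated as zero floors of that kind.
    return int(above or 0), int(below or 0)


for code in ["B2", "3F", "5FB2", "1FB3"]:
    print(code, parse_floor_code(code))
```

Upper-casing the input means the lower-cased variants that appear elsewhere in the report (for example `1fb3`) parse to the same pair.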

Length

[Figure: Histogram of lengths of the category]

Value               Count  Frequency (%)
b2                    121  40.9%
b3                     85  28.7%
b4                     37  12.5%
b5                     17  5.7%
3f                     14  4.7%
2f                      7  2.4%
b6                      3  1.0%
1fb3                    2  0.7%
1fb5                    1  0.3%
2fb2                    1  0.3%
Other values (8)        8  2.7%

면적(m²) (area, m²)
Real number (ℝ)

Distinct: 294
Distinct (%): 99.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8825.7663
Minimum: 1069.5
Maximum: 28768.4
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 1069.5
5-th percentile: 5074.575
Q1: 6552.05
Median: 8165.95
Q3: 10087.35
95-th percentile: 14952.075
Maximum: 28768.4
Range: 27698.9
Interquartile range (IQR): 3535.3

Descriptive statistics

Standard deviation: 3419.0103
Coefficient of variation (CV): 0.38738962
Kurtosis: 5.24256
Mean: 8825.7663
Median Absolute Deviation (MAD): 1707.2
Skewness: 1.6113676
Sum: 2612426.8
Variance: 11689631
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
6086.0                  2  0.7%
6439.0                  2  0.7%
6587.6                  1  0.3%
7085.2                  1  0.3%
14000.6                 1  0.3%
18195.2                 1  0.3%
5805.9                  1  0.3%
9093.3                  1  0.3%
6278.8                  1  0.3%
10805.0                 1  0.3%
Other values (284)    284  95.9%

Minimum values

Value               Count  Frequency (%)
1069.5                  1  0.3%
1423.0                  1  0.3%
1503.1                  1  0.3%
1583.0                  1  0.3%
2203.0                  1  0.3%
3860.0                  1  0.3%
4496.9                  1  0.3%
4691.0                  1  0.3%
4838.6                  1  0.3%
4844.8                  1  0.3%

Maximum values

Value               Count  Frequency (%)
28768.4                 1  0.3%
23052.8                 1  0.3%
20302.8                 1  0.3%
19246.0                 1  0.3%
18984.6                 1  0.3%
18812.7                 1  0.3%
18506.0                 1  0.3%
18459.4                 1  0.3%
18195.2                 1  0.3%
17268.9                 1  0.3%

준공연도 (year of completion)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 26
Distinct (%): 8.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1994.5135
Minimum: 1974
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 1974
5-th percentile: 1980
Q1: 1985
Median: 1996
Q3: 2001
95-th percentile: 2015
Maximum: 2022
Range: 48
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 10.378587
Coefficient of variation (CV): 0.0052035682
Kurtosis: 0.022117884
Mean: 1994.5135
Median Absolute Deviation (MAD): 5
Skewness: 0.3904303
Sum: 590376
Variance: 107.71507
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=26)]

Common values

Value               Count  Frequency (%)
1985                   47  15.9%
1996                   46  15.5%
2001                   41  13.9%
1997                   26  8.8%
2000                   19  6.4%
1984                   16  5.4%
1983                   14  4.7%
1995                   12  4.1%
1980                   11  3.7%
2012                    9  3.0%
Other values (16)      55  18.6%

Minimum values

Value               Count  Frequency (%)
1974                    9  3.0%
1980                   11  3.7%
1982                    5  1.7%
1983                   14  4.7%
1984                   16  5.4%
1985                   47  15.9%
1990                    1  0.3%
1992                    2  0.7%
1993                    8  2.7%
1994                    1  0.3%

Maximum values

Value               Count  Frequency (%)
2022                    1  0.3%
2021                    2  0.7%
2020                    2  0.7%
2019                    1  0.3%
2018                    8  2.7%
2015                    5  1.7%
2012                    9  3.0%
2010                    3  1.0%
2005                    2  0.7%
2002                    1  0.3%

Interactions

[Figures: 25 pairwise interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap]

            연번    호선    형식    길이(M)  층수    면적(m²)  준공연도
연번        1.000   0.947   0.258   0.908    0.503   0.152     0.823
호선        0.947   1.000   0.273   0.846    0.520   0.314     0.951
형식        0.258   0.273   1.000   0.077    0.000   0.517     0.335
길이(M)     0.908   0.846   0.077   1.000    0.853   0.545     0.781
층수        0.503   0.520   0.000   0.853    1.000   0.779     0.804
면적(m²)    0.152   0.314   0.517   0.545    0.779   1.000     0.325
준공연도    0.823   0.951   0.335   0.781    0.804   0.325     1.000

[Figure: correlation heatmap]

        층수    형식
층수    1.000   0.000
형식    0.000   1.000
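
The report text does not name the measure behind this two-variable matrix. Since it covers only the two categorical columns, one plausible reading is a bias-corrected Cramér's V, and that reading is an assumption here. The sketch below implements that statistic with the standard library on small hypothetical contingency tables, not on the dataset, to show how the correction can clip a weak association to exactly 0. The function name is illustrative.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as rows.

    Assumes every row and every column has a non-zero total.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Bias correction: subtract the expected value under independence, floor at 0.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denominator = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denominator) if denominator > 0 else 0.0


# Hypothetical 2x2 tables for illustration only.
perfect = [[20, 0], [0, 20]]   # categories determine each other
weak = [[11, 9], [9, 11]]      # slight imbalance, clipped by the correction
print(cramers_v_corrected(perfect), cramers_v_corrected(weak))
```

With the correction applied, a small sample imbalance such as the `weak` table yields exactly 0, which would be consistent with the 0.000 shown for 층수 and 형식 if this measure is indeed the one used.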

[Figure: correlation heatmap]

            연번     호선     길이(M)  면적(m²)  준공연도  형식    층수
연번        1.000    0.989    -0.824   0.135     0.831     0.158   0.212
호선        0.989    1.000    -0.821   0.136     0.812     0.123   0.195
길이(M)     -0.824   -0.821   1.000    0.014     -0.723    0.057   0.634
면적(m²)    0.135    0.136    0.014    1.000     0.195     0.264   0.454
준공연도    0.831    0.812    -0.723   0.195     1.000     0.145   0.376
형식        0.158    0.123    0.057    0.264     0.145     1.000   0.000
층수        0.212    0.195    0.634    0.454     0.376     0.000   1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     연번  호선  역명          형식    길이(M)  층수  면적(m²)  준공연도
0    1     1     서울          섬식    210      B2    10805.0   1974
1    2     1     시청          상대식  210      B2    11317.0   1974
2    3     1     종각          상대식  210      B2    10410.2   1974
3    4     1     종로3가       상대식  210      B2    9311.0    1974
4    5     1     종로5가       상대식  210      B2    10465.0   1974
5    6     1     동대문        상대식  210      B2    5490.0    1974
6    7     1     동묘앞        상대식  210      5FB2  7031.7    2005
7    8     1     신설동        상대식  210      B2    7240.0    1974
8    9     1     제기동        상대식  210      B2    8662.0    1974
9    10    1     청량리        섬식    210      B2    7125.0    1974

Last rows

     연번  호선  역명          형식    길이(M)  층수  면적(m²)  준공연도
286  287   9     봉은사        상대식  165      B2    9825.3    2015
287  288   9     종합운동장    상대식  165      B4    13976.5   2015
288  289   9     삼전          상대식  165      B2    8644.1    2018
289  290   9     석촌고분      섬식    165      B2    6833.6    2018
290  291   9     석촌          섬식    165      B4    10105.5   2018
291  292   9     송파나루      섬식    165      B2    7833.3    2018
292  293   9     한성백제      섬식    165      B2    8955.0    2018
293  294   9     올림픽공원    섬식    165      B3    8372.1    2018
294  295   9     둔촌오륜      섬식    165      B2    7544.3    2018
295  296   9     중앙보훈병원  복합식  165      B2    8956.0    2018