Overview

Dataset statistics

Number of variables: 8
Number of observations: 291
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 19.5 KiB
Average record size in memory: 68.5 B

Variable types

Numeric: 4
Text: 1
Categorical: 2
DateTime: 1

Dataset

Description: File download
Author: 서울교통공사 (Seoul Metro)
URL: https://data.seoul.go.kr/dataList/OA-11572/S/1/datasetView.do

Alerts

연번 is highly overall correlated with 호선 and 1 other field (High correlation)
호선 is highly overall correlated with 연번 and 1 other field (High correlation)
길이(M) is highly overall correlated with 연번 and 2 other fields (High correlation)
층수 is highly overall correlated with 길이(M) (High correlation)
연번 has unique values (Unique)

Reproduction

Analysis started: 2024-04-29 15:52:07.958929
Analysis finished: 2024-04-29 15:52:09.874171
Duration: 1.92 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
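
A run like the one recorded above could be repeated along the following lines. This is a minimal sketch, not the configuration used for this report: the function name `build_report`, the `cp949` encoding, and the report title are assumptions, and the settings stored in config.json are not reproduced here.

```python
def build_report(csv_path: str, out_path: str, encoding: str = "cp949") -> None:
    """Profile a CSV file and write an HTML report (sketch; arguments are placeholders)."""
    # Imported lazily so this module loads even without the third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # parse_dates turns 준공연도 into a DateTime variable, as in this report.
    df = pd.read_csv(csv_path, encoding=encoding, parse_dates=["준공연도"])
    profile = ProfileReport(df, title="Station profile")  # title is an assumption
    profile.to_file(out_path)
```

Calling `build_report("stations.csv", "report.html")`, with hypothetical file names, would write the report next to the input file.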

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 291
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 146
Minimum: 1
Maximum: 291
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 1
5th percentile: 15.5
Q1: 73.5
Median: 146
Q3: 218.5
95th percentile: 276.5
Maximum: 291
Range: 290
Interquartile range (IQR): 145
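
Because 연번 runs from 1 to 291 without gaps, these quantiles can be checked by hand. The sketch below assumes linear interpolation between order statistics (the default in NumPy and pandas); `quantile` is an illustrative helper, not a function of the profiling tool.

```python
import math


def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted sequence."""
    pos = q * (len(sorted_values) - 1)      # fractional index into the data
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


serial = list(range(1, 292))                # 연번: 1, 2, ..., 291
q05, q1, q3, q95 = (quantile(serial, q) for q in (0.05, 0.25, 0.75, 0.95))
median = quantile(serial, 0.5)
iqr = q3 - q1
print(q05, q1, median, q3, q95, iqr)
```

The printed values agree with the percentiles, quartiles, and IQR listed for this variable.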

Descriptive statistics

Standard deviation: 84.148678
Coefficient of variation (CV): 0.57636081
Kurtosis: -1.2
Mean: 146
Median Absolute Deviation (MAD): 73
Skewness: 0
Sum: 42486
Variance: 7081
Monotonicity: Strictly increasing
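
These figures follow from the column being the sequence 1 to 291. For n consecutive integers the sample variance is n(n + 1)/12, which gives 291 × 292 / 12 = 7081. A short check with the standard library, assuming the sample (n - 1) definitions of variance and standard deviation:

```python
import statistics

serial = list(range(1, 292))               # 연번: 1, 2, ..., 291

total = sum(serial)                        # Sum
mean = statistics.mean(serial)             # Mean
variance = statistics.variance(serial)     # sample variance, divisor n - 1
std = statistics.stdev(serial)             # sample standard deviation
cv = std / mean                            # coefficient of variation
# Median absolute deviation: median of |x - median|
med = statistics.median(serial)
mad = statistics.median(abs(x - med) for x in serial)

print(total, mean, variance, round(std, 6), round(cv, 8), mad)
```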
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
1                   1      0.3%
184                 1      0.3%
200                 1      0.3%
199                 1      0.3%
198                 1      0.3%
197                 1      0.3%
196                 1      0.3%
195                 1      0.3%
194                 1      0.3%
193                 1      0.3%
Other values (281)  281    96.6%

Minimum 10 values

Value  Count  Frequency (%)
1      1      0.3%
2      1      0.3%
3      1      0.3%
4      1      0.3%
5      1      0.3%
6      1      0.3%
7      1      0.3%
8      1      0.3%
9      1      0.3%
10     1      0.3%

Maximum 10 values

Value  Count  Frequency (%)
291    1      0.3%
290    1      0.3%
289    1      0.3%
288    1      0.3%
287    1      0.3%
286    1      0.3%
285    1      0.3%
284    1      0.3%
283    1      0.3%
282    1      0.3%

호선 (line number)
Real number (ℝ)

HIGH CORRELATION

Distinct: 9
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.862543
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 3
Median: 5
Q3: 7
95th percentile: 8
Maximum: 9
Range: 8
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.1739062
Coefficient of variation (CV): 0.44707187
Kurtosis: -1.0462745
Mean: 4.862543
Median Absolute Deviation (MAD): 2
Skewness: 0.0053774411
Sum: 1415
Variance: 4.725868
Monotonicity: Increasing
Histogram with fixed size bins (bins=9)

Common values

Value  Count  Frequency (%)
5      51     17.5%
7      51     17.5%
2      50     17.2%
6      39     13.4%
3      34     11.7%
4      26     8.9%
8      17     5.8%
9      13     4.5%
1      10     3.4%

Minimum values

Value  Count  Frequency (%)
1      10     3.4%
2      50     17.2%
3      34     11.7%
4      26     8.9%
5      51     17.5%
6      39     13.4%
7      51     17.5%
8      17     5.8%
9      13     4.5%

Maximum values

Value  Count  Frequency (%)
9      13     4.5%
8      17     5.8%
7      51     17.5%
6      39     13.4%
5      51     17.5%
4      26     8.9%
3      34     11.7%
2      50     17.2%
1      10     3.4%
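
The summary figures for 호선 can be recomputed from the frequency table alone, which is a useful consistency check on a report like this one. The sketch below hard-codes the counts listed for this variable and assumes the sample (n - 1) variance.

```python
# Value -> count for 호선, copied from the frequency table of this variable.
counts = {1: 10, 2: 50, 3: 34, 4: 26, 5: 51, 6: 39, 7: 51, 8: 17, 9: 13}

n = sum(counts.values())                                   # number of observations
total = sum(value * c for value, c in counts.items())      # Sum
mean = total / n                                           # Mean
# Sample variance from grouped data: sum of c * (value - mean)^2 over n - 1
variance = sum(c * (value - mean) ** 2 for value, c in counts.items()) / (n - 1)
# Frequency (%) column, rounded to one decimal as in the report
share = {value: round(100 * c / n, 1) for value, c in counts.items()}

print(n, total, round(mean, 6), round(variance, 6), share[5])
```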

역명 (station name)
Text

Distinct: 253
Distinct (%): 86.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 9
Median length: 2
Mean length: 2.9621993
Min length: 2

Characters and Unicode

Total characters: 862
Distinct characters: 218
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
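
The category counts in this section come from the general category that Unicode assigns to each character. A small illustration with Python's `unicodedata` module, applied only to the five sample station names shown in this report rather than to the full column:

```python
import unicodedata
from collections import Counter

# The five sample values of 역명 shown in this report.
names = ["서울", "시청", "종각", "종로3가", "종로5가"]

# General category of every character: "Lo" = Other Letter, "Nd" = Decimal Number.
categories = Counter(unicodedata.category(ch) for name in names for ch in name)
# First word of the Unicode character name, used here as a rough script label.
labels = Counter(unicodedata.name(ch).split()[0] for name in names for ch in name)

print(categories)
print(labels)
```

The first word of the character name is only a stand-in for the script and block properties, which the standard library does not expose directly.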

Unique

Unique: 217
Unique (%): 74.6%
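
Distinct counts every different station name once, while Unique counts only the names that occur exactly once, so 253 - 217 = 36 names appear in more than one row. The distinction can be sketched with a `Counter`; the short list below is illustrative, not the real column.

```python
from collections import Counter

# Illustrative values only: two names repeated, two names listed once.
names = ["종로3가", "종로3가", "종로3가", "충정로", "충정로", "서울", "시청"]

freq = Counter(names)
distinct = len(freq)                                   # different values
unique = sum(1 for c in freq.values() if c == 1)       # values seen exactly once
repeated = distinct - unique                           # values seen more than once

print(distinct, unique, repeated)
```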

Sample

1st row: 서울
2nd row: 시청
3rd row: 종각
4th row: 종로3가
5th row: 종로5가

Common values

Value               Count  Frequency (%)
동대문역사문화공원  3      1.0%
종로3가             3      1.0%
충정로              2      0.7%
고속터미널          2      0.7%
노원                2      0.7%
사당                2      0.7%
석촌                2      0.7%
영등포구청          2      0.7%
합정                2      0.7%
대림                2      0.7%
Other values (243)  269    92.4%

Most occurring characters

Value               Count  Frequency (%)
[?]                 32     3.7%
[?]                 29     3.4%
[?]                 26     3.0%
[?]                 25     2.9%
[?]                 18     2.1%
[?]                 16     1.9%
[?]                 15     1.7%
[?]                 15     1.7%
[?]                 15     1.7%
[?]                 15     1.7%
Other values (208)  656    76.1%

Most occurring categories

Value           Count  Frequency (%)
Other Letter    854    99.1%
Decimal Number  8      0.9%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
[?]                 32     3.7%
[?]                 29     3.4%
[?]                 26     3.0%
[?]                 25     2.9%
[?]                 18     2.1%
[?]                 16     1.9%
[?]                 15     1.8%
[?]                 15     1.8%
[?]                 15     1.8%
[?]                 15     1.8%
Other values (205)  648    75.9%

Decimal Number

Value  Count  Frequency (%)
3      5      62.5%
4      2      25.0%
5      1      12.5%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  854    99.1%
Common  8      0.9%

Most frequent character per script

Hangul

Value               Count  Frequency (%)
[?]                 32     3.7%
[?]                 29     3.4%
[?]                 26     3.0%
[?]                 25     2.9%
[?]                 18     2.1%
[?]                 16     1.9%
[?]                 15     1.8%
[?]                 15     1.8%
[?]                 15     1.8%
[?]                 15     1.8%
Other values (205)  648    75.9%

Common

Value  Count  Frequency (%)
3      5      62.5%
4      2      25.0%
5      1      12.5%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  854    99.1%
ASCII   8      0.9%

Most frequent character per block

Hangul

Value               Count  Frequency (%)
[?]                 32     3.7%
[?]                 29     3.4%
[?]                 26     3.0%
[?]                 25     2.9%
[?]                 18     2.1%
[?]                 16     1.9%
[?]                 15     1.8%
[?]                 15     1.8%
[?]                 15     1.8%
[?]                 15     1.8%
Other values (205)  648    75.9%

ASCII

Value  Count  Frequency (%)
3      5      62.5%
4      2      25.0%
5      1      12.5%

형식 (platform type)
Categorical

Distinct: 3
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

상대식 (side platforms): 199
섬식 (island platform): 78
복합식 (combined): 14

Length

Max length: 3
Median length: 3
Mean length: 2.7319588
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 섬식
2nd row: 상대식
3rd row: 상대식
4th row: 상대식
5th row: 상대식

Common Values

Value   Count  Frequency (%)
상대식  199    68.4%
섬식    78     26.8%
복합식  14     4.8%
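
The length statistics for 형식 follow directly from the three categories and their counts: 상대식 and 복합식 have three characters, 섬식 has two. A quick check using the counts listed for this variable:

```python
import statistics

# Category -> count for 형식, copied from the Common Values table.
counts = {"상대식": 199, "섬식": 78, "복합식": 14}

# One length per observation, so that ordinary statistics functions apply.
lengths = [len(value) for value, c in counts.items() for _ in range(c)]

mean_length = statistics.mean(lengths)      # (199*3 + 78*2 + 14*3) / 291
median_length = statistics.median(lengths)
print(round(mean_length, 7), median_length, min(lengths), max(lengths))
```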

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
상대식  199    68.4%
섬식    78     26.8%
복합식  14     4.8%

길이(M) (length, m)
Real number (ℝ)

HIGH CORRELATION

Distinct: 6
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 178.00687
Minimum: 90
Maximum: 210
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 90
5th percentile: 125
Q1: 165
Median: 165
Q3: 205
95th percentile: 205
Maximum: 210
Range: 120
Interquartile range (IQR): 40

Descriptive statistics

Standard deviation: 24.402797
Coefficient of variation (CV): 0.13708907
Kurtosis: -0.30126288
Mean: 178.00687
Median Absolute Deviation (MAD): 0
Skewness: -0.33626231
Sum: 51800
Variance: 595.4965
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)

Common values

Value  Count  Frequency (%)
165    157    54.0%
205    104    35.7%
125    17     5.8%
210    10     3.4%
130    2      0.7%
90     1      0.3%

Minimum values

Value  Count  Frequency (%)
90     1      0.3%
125    17     5.8%
130    2      0.7%
165    157    54.0%
205    104    35.7%
210    10     3.4%

Maximum values

Value  Count  Frequency (%)
210    10     3.4%
205    104    35.7%
165    157    54.0%
130    2      0.7%
125    17     5.8%
90     1      0.3%
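
A median absolute deviation of 0 looks surprising next to a standard deviation of about 24 m, but it follows from the frequency table: 157 of the 291 values (54.0%) equal the median of 165, so more than half of the absolute deviations are zero. A sketch that rebuilds the column from the counts listed for this variable:

```python
import statistics

# Value -> count for 길이(M), copied from the frequency table of this variable.
counts = {90: 1, 125: 17, 130: 2, 165: 157, 205: 104, 210: 10}

lengths = [value for value, c in counts.items() for _ in range(c)]
median = statistics.median(lengths)
# MAD: median of the absolute deviations from the median.
mad = statistics.median(abs(x - median) for x in lengths)

print(len(lengths), sum(lengths), median, mad)
```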

층수 (floor level)
Categorical

HIGH CORRELATION

Distinct: 18
Distinct (%): 6.2%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

B2: 118
B3: 83
B4: 37
B5: 17
3F: 14
Other values (13): 22

Length

Max length: 4
Median length: 2
Mean length: 2.0549828
Min length: 2

Unique

Unique: 10
Unique (%): 3.4%

Sample

1st row: B2
2nd row: B2
3rd row: B2
4th row: B2
5th row: B2

Common Values

Value             Count  Frequency (%)
B2                118    40.5%
B3                83     28.5%
B4                37     12.7%
B5                17     5.8%
3F                14     4.8%
2F                7      2.4%
B6                3      1.0%
1FB3              2      0.7%
5FB2              1      0.3%
1F                1      0.3%
Other values (8)  8      2.7%

Length

Histogram of lengths of the category

Value             Count  Frequency (%)
b2                118    40.5%
b3                83     28.5%
b4                37     12.7%
b5                17     5.8%
3f                14     4.8%
2f                7      2.4%
b6                3      1.0%
1fb3              2      0.7%
1fb5              1      0.3%
2fb2              1      0.3%
Other values (8)  8      2.7%

면적(m²) (area, m²)
Real number (ℝ)

Distinct: 289
Distinct (%): 99.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8789.2901
Minimum: 1069.48
Maximum: 28768.4
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 1069.48
5th percentile: 5072.155
Q1: 6544.455
Median: 8138.28
Q3: 10048.14
95th percentile: 15074.237
Maximum: 28768.4
Range: 27698.92
Interquartile range (IQR): 3503.685

Descriptive statistics

Standard deviation: 3427.0069
Coefficient of variation (CV): 0.38990714
Kurtosis: 5.3564199
Mean: 8789.2901
Median Absolute Deviation (MAD): 1698.06
Skewness: 1.6435784
Sum: 2557683.4
Variance: 11744376
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
6086.0              2      0.7%
6439.0              2      0.7%
11098.9             1      0.3%
6793.79             1      0.3%
14457.44            1      0.3%
10677.55            1      0.3%
7799.15             1      0.3%
9495.25             1      0.3%
6278.79             1      0.3%
6545.91             1      0.3%
Other values (279)  279    95.9%

Minimum 10 values

Value     Count  Frequency (%)
1069.48   1      0.3%
1423.0    1      0.3%
1503.05   1      0.3%
1583.0    1      0.3%
2203.0    1      0.3%
3860.0    1      0.3%
4496.94   1      0.3%
4691.0    1      0.3%
4838.6    1      0.3%
4844.77   1      0.3%

Maximum 10 values

Value     Count  Frequency (%)
28768.4   1      0.3%
23052.81  1      0.3%
20302.8   1      0.3%
19246.0   1      0.3%
18984.55  1      0.3%
18812.65  1      0.3%
18506.0   1      0.3%
18459.41  1      0.3%
18195.21  1      0.3%
17268.9   1      0.3%
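
The positive skewness (1.64) and the gap between the 95th percentile and the maximum point to a few very large stations. One common way to flag such values is Tukey's 1.5 × IQR rule. This rule is not part of the report and is used here only as an illustration, with Q1, Q3, and the extreme values copied from the listings for this variable.

```python
# Quartiles of 면적(m²) as reported for this variable.
q1, q3 = 6544.455, 10048.14
iqr = q3 - q1

# Tukey fences: values outside [q1 - 1.5*IQR, q3 + 1.5*IQR] are flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

smallest = [1069.48, 1423.0, 1503.05, 1583.0, 2203.0]       # five smallest values
largest = [28768.4, 23052.81, 20302.8, 19246.0, 18984.55]   # five largest values

low_outliers = [x for x in smallest if x < lower_fence]
high_outliers = [x for x in largest if x > upper_fence]
print(round(lower_fence, 4), round(upper_fence, 4), low_outliers, high_outliers)
```

Under this rule only the smallest value falls below the lower fence, while all five of the largest values exceed the upper fence.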

준공연도 (completion date)
DateTime

Distinct: 61
Distinct (%): 21.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB
Minimum: 1974-08-15 00:00:00
Maximum: 2019-10-28 00:00:00
Histogram with fixed size bins (bins=50)

Interactions

[Figure: 16 interaction plots]

Correlations

          연번   호선   형식   길이(M)  층수   면적(m²)  준공연도
연번      1.000  0.939  0.194  0.898    0.476  0.158     0.985
호선      0.939  1.000  0.272  0.846    0.529  0.311     0.995
형식      0.194  0.272  1.000  0.081    0.000  0.519     0.638
길이(M)   0.898  0.846  0.081  1.000    0.854  0.543     0.976
층수      0.476  0.529  0.000  0.854    1.000  0.780     0.919
면적(m²)  0.158  0.311  0.519  0.543    0.780  1.000     0.512
준공연도  0.985  0.995  0.638  0.976    0.919  0.512     1.000
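
The High correlation alerts in the Alerts section can be related to this matrix. The sketch below flags every pair whose coefficient reaches a cutoff. The cutoff of 0.8 and the exclusion of the DateTime column 준공연도 are assumptions chosen because they reproduce the alert list, not settings confirmed by the report.

```python
columns = ["연번", "호선", "형식", "길이(M)", "층수", "면적(m²)"]
# First six rows and columns of the matrix (준공연도 left out by assumption).
matrix = [
    [1.000, 0.939, 0.194, 0.898, 0.476, 0.158],
    [0.939, 1.000, 0.272, 0.846, 0.529, 0.311],
    [0.194, 0.272, 1.000, 0.081, 0.000, 0.519],
    [0.898, 0.846, 0.081, 1.000, 0.854, 0.543],
    [0.476, 0.529, 0.000, 0.854, 1.000, 0.780],
    [0.158, 0.311, 0.519, 0.543, 0.780, 1.000],
]
THRESHOLD = 0.8  # assumed cutoff, not taken from the report's configuration

alerts = {}
for i, name in enumerate(columns):
    # Every other column whose absolute coefficient reaches the cutoff.
    partners = [columns[j] for j, value in enumerate(matrix[i])
                if j != i and abs(value) >= THRESHOLD]
    if partners:
        alerts[name] = partners

for name, partners in alerts.items():
    print(name, "->", ", ".join(partners))
```

With these assumptions, 연번 and 호선 are each flagged with two partners, 길이(M) with three, and 층수 with one, matching the counts in the alerts.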

      층수   형식
층수  1.000  0.000
형식  0.000  1.000

          연번    호선    길이(M)  면적(m²)  형식   층수
연번      1.000   0.990   -0.826   0.135     0.121  0.207
호선      0.990   1.000   -0.822   0.138     0.122  0.199
길이(M)   -0.826  -0.822  1.000    0.024     0.060  0.635
면적(m²)  0.135   0.138   0.024    1.000     0.265  0.455
형식      0.121   0.122   0.060    0.265     1.000  0.000
층수      0.207   0.199   0.635    0.455     0.000  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  연번  호선  역명    형식    길이(M)  층수  면적(m²)  준공연도
0      1     1     서울    섬식    210      B2    10805.0   1974-08-15
1      2     1     시청    상대식  210      B2    11317.0   1974-08-15
2      3     1     종각    상대식  210      B2    10410.24  1974-08-15
3      4     1     종로3가 상대식  210      B2    9311.0    1974-08-15
4      5     1     종로5가 상대식  210      B2    10465.0   1974-08-15
5      6     1     동대문  상대식  210      B2    5490.0    1974-08-15
6      7     1     동묘앞  상대식  210      5FB2  7031.66   2005-12-21
7      8     1     신설동  상대식  210      B2    7240.0    1974-08-15
8      9     1     제기동  상대식  210      B2    8662.0    1974-08-15
9      10    1     청량리  섬식    210      B2    7125.0    1974-08-15

Last rows

Index  연번  호선  역명          형식    길이(M)  층수  면적(m²)  준공연도
281    282   9     봉은사        상대식  165      B2    9825.28   2015-03-28
282    283   9     종합운동장    상대식  165      B4    13976.51  2015-03-28
283    284   9     삼전          상대식  165      B2    8644.07   2018-12-01
284    285   9     석촌고분      섬식    165      B2    6833.56   2018-12-01
285    286   9     석촌          섬식    165      B4    10105.46  2018-12-01
286    287   9     송파나루      섬식    165      B2    7833.29   2018-12-01
287    288   9     한성백제      섬식    165      B2    8954.96   2018-12-01
288    289   9     올림픽공원    섬식    165      B3    8372.08   2018-12-01
289    290   9     둔촌오륜      섬식    165      B2    7544.33   2018-12-01
290    291   9     중앙보훈병원  복합식  165      B2    8956.0    2018-12-01