Overview

Dataset statistics

Number of variables: 10
Number of observations: 47
Missing cells: 2
Missing cells (%): 0.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.9 KiB
Average record size in memory: 85.8 B
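The derived figures in this table follow from the raw counts. The sketch below is a cross-check using only the numbers printed above, not the profiler's own code. The variable names are illustrative. Because 3.9 KiB is itself rounded, the record size comes out near the reported 85.8 B rather than matching it exactly.

```python
# Figures copied from the dataset statistics above.
n_variables = 10
n_observations = 47
missing_cells = 2
total_size_kib = 3.9  # already rounded in the report

total_cells = n_variables * n_observations        # 470 cells in total
missing_pct = 100 * missing_cells / total_cells   # share of empty cells
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Approximate record size: {avg_record_bytes:.1f} B")
```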

Variable types

Categorical: 7
Text: 1
Numeric: 2

Dataset

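Description: Data on the urban and metropolitan railway stations on Seoul Metropolitan Subway Line 1 (수도권1호선): railway operator name, line name, station name, wheelchair lift management number, detailed location, length, width, start floor, and end floor.
Author: 국가철도공단 (Korea National Railway)
URL: https://www.data.go.kr/data/15041428/fileData.do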

Alerts

선명 has constant value "1호선" (Constant)
철도운영기관명 is highly overall correlated with 길이 and 4 other fields (High correlation)
종료층 is highly overall correlated with 철도운영기관명 and 2 other fields (High correlation)
출입구번호 is highly overall correlated with 철도운영기관명 (High correlation)
길이 is highly overall correlated with 폭 and 2 other fields (High correlation)
폭 is highly overall correlated with 길이 and 1 other field (High correlation)
역명 is highly overall correlated with 길이 and 4 other fields (High correlation)
시작층 is highly overall correlated with 철도운영기관명 and 2 other fields (High correlation)
길이 has 1 (2.1%) missing value (Missing)
폭 has 1 (2.1%) missing value (Missing)

Reproduction

Analysis started: 2023-12-12 13:56:11.910353
Analysis finished: 2023-12-12 13:56:13.052301
Duration: 1.14 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

철도운영기관명 (railway operator name)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 4.3%
Missing: 0
Missing (%): 0.0%
Memory size: 508.0 B

Length

Max length: 6
Median length: 3
Mean length: 3.5106383
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울교통공사
2nd row: 서울교통공사
3rd row: 서울교통공사
4th row: 서울교통공사
5th row: 서울교통공사

Common Values

Value  Count  Frequency (%)
코레일  39  83.0%
서울교통공사  8  17.0%
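The percentages and the mean length reported for this variable can be recomputed from the two category counts. The following is a minimal sketch under that reading. The helper `frequency_pct` is a hypothetical name, and the calculation is a cross-check rather than the profiler's implementation.

```python
# Category counts copied from the Common Values table above.
counts = {"코레일": 39, "서울교통공사": 8}

n_rows = sum(counts.values())  # 47 observations, none missing


def frequency_pct(count: int, total: int) -> float:
    """Share of rows holding one category, as a percentage."""
    return 100 * count / total


distinct_pct = 100 * len(counts) / n_rows
# Mean length weights each category's character count by its row count.
mean_length = sum(len(value) * c for value, c in counts.items()) / n_rows

for value, c in counts.items():
    print(f"{value}: {c} ({frequency_pct(c, n_rows):.1f}%)")
print(f"Distinct (%): {distinct_pct:.1f}%")
print(f"Mean length: {mean_length:.7f}")
```

Python's `len` counts code points, so each Hangul syllable counts as one character, which agrees with the reported maximum length of 6 for 서울교통공사.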

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

선명 (line name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Memory size: 508.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1호선
2nd row: 1호선
3rd row: 1호선
4th row: 1호선
5th row: 1호선

Common Values

Value  Count  Frequency (%)
1호선  47  100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

역명 (station name)
Categorical

HIGH CORRELATION

Distinct: 22
Distinct (%): 46.8%
Missing: 0
Missing (%): 0.0%
Memory size: 508.0 B

Length

Max length: 12
Median length: 7
Mean length: 3
Min length: 2

Unique

Unique: 9
Unique (%): 19.1%

Sample

1st row: 서울역
2nd row: 신설동
3rd row: 신설동
4th row: 신설동
5th row: 신설동

Common Values

Value  Count  Frequency (%)
구일  5  10.6%
신설동  5  10.6%
구로  4  8.5%
노량진  4  8.5%
신이문  3  6.4%
영등포  3  6.4%
군포  2  4.3%
석수  2  4.3%
망월사  2  4.3%
서정리  2  4.3%
Other values (12)  15  31.9%

Length

[Figure: histogram of lengths of the category]

Words

Value  Count  Frequency (%)
구일  5  10.6%
신설동  5  10.6%
구로  4  8.5%
노량진  4  8.5%
신이문  3  6.4%
영등포  3  6.4%
서정리  2  4.3%
평택  2  4.3%
오산  2  4.3%
청량리(서울시립대입구  2  4.3%
Other values (12)  15  31.9%

휠체어리프트의 관리번호 (wheelchair lift management number)
Categorical

Distinct: 5
Distinct (%): 10.6%
Missing: 0
Missing (%): 0.0%
Memory size: 508.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 2
4th row: 3
5th row: 4

Common Values

Value  Count  Frequency (%)
1  22  46.8%
2  13  27.7%
3  6  12.8%
4  4  8.5%
5  2  4.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

출입구번호 (exit number)
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): 12.8%
Missing: 0
Missing (%): 0.0%
Memory size: 508.0 B

Length

Max length: 7
Median length: 4
Mean length: 2.787234
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value  Count  Frequency (%)
<NA>  20  42.6%
1  14  29.8%
2  6  12.8%
3  3  6.4%
02월 03일  2  4.3%
01월 02일  2  4.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Words

Value  Count  Frequency (%)
na  20  39.2%
1  14  27.5%
2  6  11.8%
3  3  5.9%
02월  2  3.9%
03일  2  3.9%
01월  2  3.9%
02일  2  3.9%
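
The table above lists tokens such as na, 02월 and 03일 rather than whole values, and its percentages sum over 51 tokens instead of 47 rows. These numbers are consistent with lowercasing each value, splitting it on whitespace, and trimming surrounding punctuation. The sketch below reproduces the table under that assumption. The `tokenize` rule is a guess that happens to match the figures, not the documented behaviour of ydata-profiling.

```python
import string
from collections import Counter

# Whole-value counts copied from the Common Values table of this variable.
value_counts = {
    "<NA>": 20, "1": 14, "2": 6, "3": 3, "02월 03일": 2, "01월 02일": 2,
}


def tokenize(value: str) -> list[str]:
    """Lowercase, split on whitespace, trim ASCII punctuation (assumed rule)."""
    tokens = (tok.strip(string.punctuation) for tok in value.lower().split())
    return [tok for tok in tokens if tok]


word_counts = Counter()
for value, count in value_counts.items():
    for token in tokenize(value):
        word_counts[token] += count

total_words = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(f"{word}: {count} ({100 * count / total_words:.1f}%)")
```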

상세위치 (detailed location)
Text

Distinct: 46
Distinct (%): 97.9%
Missing: 0
Missing (%): 0.0%
Memory size: 508.0 B

Length

Max length: 32
Median length: 22
Mean length: 18.531915
Min length: 4

Characters and Unicode

Total characters: 871
Distinct characters: 109
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
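
As a small illustration of these character properties, the sketch below classifies one of this variable's sample values with Python's standard `unicodedata` module. It covers general categories only, because the standard library does not expose script or block names. The two-letter codes map to the category names used in this report (for example `Lo` is Other Letter and `Zs` is Space Separator).

```python
import unicodedata
from collections import Counter

sample = "(B2)내부 C 계단"  # one sample value of this variable

# General category of each code point in the sample string.
categories = Counter(unicodedata.category(ch) for ch in sample)

for code, count in categories.most_common():
    print(code, count)
```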

Unique

Unique: 45
Unique (%): 95.7%

Sample

1st row: (B2)내부 C 계단
2nd row: (F1)6번 출입구
3rd row: (B2)상선승강장 시점측
4th row: (B2)하선승강장 시점측
5th row: (B1)대합실 연결통로

Words

Value  Count  Frequency (%)
승강장  19  9.1%
방향  16  7.7%
1f  14  6.7%
계단  12  5.8%
(word lost in extraction)  7  3.4%
1층  6  2.9%
입구  6  2.9%
표내는  5  2.4%
맞이방  5  2.4%
1번출입구  5  2.4%
Other values (74)  113  54.3%

Most occurring characters

Value  Count  Frequency (%)
(space)  169  19.4%
(  41  4.7%
)  41  4.7%
1  40  4.6%
(character lost in extraction)  35  4.0%
(character lost in extraction)  30  3.4%
(character lost in extraction)  29  3.3%
(character lost in extraction)  29  3.3%
F  24  2.8%
2  18  2.1%
Other values (99)  415  47.6%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  483  55.5%
Space Separator  169  19.4%
Decimal Number  80  9.2%
Open Punctuation  41  4.7%
Close Punctuation  41  4.7%
Uppercase Letter  36  4.1%
Dash Punctuation  10  1.1%
Other Punctuation  9  1.0%
Math Symbol  2  0.2%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
(character lost in extraction)  35  7.2%
(character lost in extraction)  30  6.2%
(character lost in extraction)  29  6.0%
(character lost in extraction)  29  6.0%
(character lost in extraction)  18  3.7%
(character lost in extraction)  17  3.5%
(character lost in extraction)  16  3.3%
(character lost in extraction)  16  3.3%
(character lost in extraction)  14  2.9%
(character lost in extraction)  13  2.7%
Other values (79)  266  55.1%

Decimal Number
Value  Count  Frequency (%)
1  40  50.0%
2  18  22.5%
3  6  7.5%
4  6  7.5%
0  2  2.5%
5  2  2.5%
6  2  2.5%
8  2  2.5%
9  1  1.2%
7  1  1.2%

Uppercase Letter
Value  Count  Frequency (%)
F  24  66.7%
B  9  25.0%
C  2  5.6%
A  1  2.8%

Space Separator
Value  Count  Frequency (%)
(space)  169  100.0%

Open Punctuation
Value  Count  Frequency (%)
(  41  100.0%

Close Punctuation
Value  Count  Frequency (%)
)  41  100.0%

Dash Punctuation
Value  Count  Frequency (%)
-  10  100.0%

Other Punctuation
Value  Count  Frequency (%)
/  9  100.0%

Math Symbol
Value  Count  Frequency (%)
>  2  100.0%

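Most occurring scripts

Value  Count  Frequency (%)
Hangul  483  55.5%
Common  352  40.4%
Latin  36  4.1%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
(character lost in extraction)  35  7.2%
(character lost in extraction)  30  6.2%
(character lost in extraction)  29  6.0%
(character lost in extraction)  29  6.0%
(character lost in extraction)  18  3.7%
(character lost in extraction)  17  3.5%
(character lost in extraction)  16  3.3%
(character lost in extraction)  16  3.3%
(character lost in extraction)  14  2.9%
(character lost in extraction)  13  2.7%
Other values (79)  266  55.1%

Common
Value  Count  Frequency (%)
(space)  169  48.0%
(  41  11.6%
)  41  11.6%
1  40  11.4%
2  18  5.1%
-  10  2.8%
/  9  2.6%
3  6  1.7%
4  6  1.7%
>  2  0.6%
Other values (6)  10  2.8%

Latin
Value  Count  Frequency (%)
F  24  66.7%
B  9  25.0%
C  2  5.6%
A  1  2.8%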

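Most occurring blocks

Value  Count  Frequency (%)
Hangul  483  55.5%
ASCII  388  44.5%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
(space)  169  43.6%
(  41  10.6%
)  41  10.6%
1  40  10.3%
F  24  6.2%
2  18  4.6%
-  10  2.6%
B  9  2.3%
/  9  2.3%
3  6  1.5%
Other values (10)  21  5.4%

Hangul
Value  Count  Frequency (%)
(character lost in extraction)  35  7.2%
(character lost in extraction)  30  6.2%
(character lost in extraction)  29  6.0%
(character lost in extraction)  29  6.0%
(character lost in extraction)  18  3.7%
(character lost in extraction)  17  3.5%
(character lost in extraction)  16  3.3%
(character lost in extraction)  16  3.3%
(character lost in extraction)  14  2.9%
(character lost in extraction)  13  2.7%
Other values (79)  266  55.1%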

길이 (length)
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 14
Distinct (%): 30.4%
Missing: 1
Missing (%): 2.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 962.82609
Minimum: 5
Maximum: 3000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 555.0 B

Quantile statistics

Minimum: 5
5th percentile: 16.25
Q1: 125
Median: 1250
Q3: 1250
95th percentile: 1400
Maximum: 3000
Range: 2995
Interquartile range (IQR): 1125

Descriptive statistics

Standard deviation: 620.53709
Coefficient of variation (CV): 0.64449551
Kurtosis: 1.1557214
Mean: 962.82609
Median Absolute Deviation (MAD): 75
Skewness: 0.058488535
Sum: 44290
Variance: 385066.28
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=14)]
Common values

Value  Count  Frequency (%)
1250  17  36.2%
125  8  17.0%
1400  5  10.6%
1200  4  8.5%
1350  2  4.3%
1270  2  4.3%
3000  1  2.1%
1100  1  2.1%
15  1  2.1%
5  1  2.1%
Other values (4)  4  8.5%

Minimum values (ascending)

Value  Count  Frequency (%)
5  1  2.1%
10  1  2.1%
15  1  2.1%
20  1  2.1%
50  1  2.1%
125  8  17.0%
800  1  2.1%
1100  1  2.1%
1200  4  8.5%
1250  17  36.2%

Maximum values (descending)

Value  Count  Frequency (%)
3000  1  2.1%
1400  5  10.6%
1350  2  4.3%
1270  2  4.3%
1250  17  36.2%
1200  4  8.5%
1100  1  2.1%
800  1  2.1%
125  8  17.0%
50  1  2.1%


폭 (width)
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 8
Distinct (%): 17.4%
Missing: 1
Missing (%): 2.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 475.76087
Minimum: 80
Maximum: 1200
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 555.0 B

Quantile statistics

Minimum: 80
5th percentile: 80
Q1: 82.25
Median: 460
Q3: 800
95th percentile: 950
Maximum: 1200
Range: 1120
Interquartile range (IQR): 717.75

Descriptive statistics

Standard deviation: 392.17871
Coefficient of variation (CV): 0.82431897
Kurtosis: -1.7287706
Mean: 475.76087
Median Absolute Deviation (MAD): 340
Skewness: 0.16779064
Sum: 21885
Variance: 153804.14
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=8)]
Common values

Value  Count  Frequency (%)
800  18  38.3%
80  11  23.4%
120  9  19.1%
950  3  6.4%
1200  2  4.3%
110  1  2.1%
83  1  2.1%
82  1  2.1%
(Missing)  1  2.1%

Minimum values (ascending)

Value  Count  Frequency (%)
80  11  23.4%
82  1  2.1%
83  1  2.1%
110  1  2.1%
120  9  19.1%
800  18  38.3%
950  3  6.4%
1200  2  4.3%

Maximum values (descending)

Value  Count  Frequency (%)
1200  2  4.3%
950  3  6.4%
800  18  38.3%
120  9  19.1%
110  1  2.1%
83  1  2.1%
82  1  2.1%
80  11  23.4%
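
Because this variable has only eight distinct values, its full distribution can be rebuilt from the frequency tables above and the summary statistics recomputed. The sketch below uses Python's `statistics` module as a cross-check. It assumes the report uses the sample (n - 1) variance and linearly interpolated quantiles, which is what reproduces the printed figures here. It is not the profiler's own code.

```python
import statistics

# Value counts copied from the frequency tables above (46 non-missing values).
value_counts = {80: 11, 82: 1, 83: 1, 110: 1, 120: 9, 800: 18, 950: 3, 1200: 2}
data = sorted(v for v, c in value_counts.items() for _ in range(c))

mean = statistics.mean(data)
variance = statistics.variance(data)  # sample variance, n - 1 denominator
std_dev = statistics.stdev(data)
median = statistics.median(data)
# "inclusive" interpolates linearly between neighbouring order statistics.
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
# Median absolute deviation: median distance from the median.
mad = statistics.median(abs(x - median) for x in data)

print(f"Sum: {sum(data)}, Mean: {mean:.5f}")
print(f"Variance: {variance:.2f}, Std dev: {std_dev:.5f}")
print(f"Q1: {q1}, Median: {median}, Q3: {q3}, IQR: {q3 - q1}")
print(f"MAD: {mad}, CV: {std_dev / mean:.8f}")
```

The median of 460 is the midpoint of the 23rd and 24th sorted values (120 and 800), so it is not a value that occurs in the data.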

시작층 (start floor)
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 10.6%
Missing: 0
Missing (%): 0.0%
Memory size: 508.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지하2
2nd row: 지상1
3rd row: 지하2
4th row: 지하2
5th row: 지하1

Common Values

Value  Count  Frequency (%)
지상1  28  59.6%
지상3  6  12.8%
지하2  5  10.6%
지하1  4  8.5%
지상2  4  8.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

종료층 (end floor)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 8.5%
Missing: 0
Missing (%): 0.0%
Memory size: 508.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지하1
2nd row: 지하1
3rd row: 지하1
4th row: 지하1
5th row: 지하1

Common Values

Value  Count  Frequency (%)
지상1  18  38.3%
지상2  14  29.8%
지하1  9  19.1%
지상3  6  12.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Interactions

[Figures: interaction plots]

Correlations

[Figure: correlation heatmap]

Variable  철도운영기관명  역명  휠체어리프트의 관리번호  출입구번호  상세위치  길이  폭  시작층  종료층
철도운영기관명  1.000  1.000  0.000  NaN  1.000  0.580  0.574  0.689  0.993
역명  1.000  1.000  0.000  0.801  1.000  1.000  1.000  0.924  0.982
휠체어리프트의 관리번호  0.000  0.000  1.000  0.000  0.000  0.000  0.000  0.000  0.000
출입구번호  NaN  0.801  0.000  1.000  1.000  0.761  0.132  0.323  0.485
상세위치  1.000  1.000  0.000  1.000  1.000  1.000  1.000  1.000  1.000
길이  0.580  1.000  0.000  0.761  1.000  1.000  0.439  0.491  0.476
폭  0.574  1.000  0.000  0.132  1.000  0.439  1.000  0.000  0.659
시작층  0.689  0.924  0.000  0.323  1.000  0.491  0.000  1.000  0.612
종료층  0.993  0.982  0.000  0.485  1.000  0.476  0.659  0.612  1.000

[Figure: correlation heatmap]

Variable  시작층  휠체어리프트의 관리번호  철도운영기관명  역명  종료층  출입구번호
시작층  1.000  0.000  0.794  0.590  0.533  0.251
휠체어리프트의 관리번호  0.000  1.000  0.000  0.000  0.000  0.000
철도운영기관명  0.794  0.000  1.000  0.745  0.905  1.000
역명  0.590  0.000  0.745  1.000  0.709  0.464
종료층  0.533  0.000  0.905  0.709  1.000  0.395
출입구번호  0.251  0.000  1.000  0.464  0.395  1.000

[Figure: correlation heatmap]

Variable  길이  폭  철도운영기관명  역명  휠체어리프트의 관리번호  출입구번호  시작층  종료층
길이  1.000  0.533  0.675  0.781  0.000  0.374  0.197  0.399
폭  0.533  1.000  0.384  0.772  0.000  0.128  0.000  0.309
철도운영기관명  0.675  0.384  1.000  0.745  0.000  1.000  0.794  0.905
역명  0.781  0.772  0.745  1.000  0.000  0.464  0.590  0.709
휠체어리프트의 관리번호  0.000  0.000  0.000  0.000  1.000  0.000  0.000  0.000
출입구번호  0.374  0.128  1.000  0.464  0.000  1.000  0.251  0.395
시작층  0.197  0.000  0.794  0.590  0.000  0.251  1.000  0.533
종료층  0.399  0.309  0.905  0.709  0.000  0.395  0.533  1.000
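
The report does not label which coefficient each matrix uses. For pairs of categorical variables, one common association measure is Cramér's V, which ydata-profiling offers among its correlation options. The sketch below implements the textbook, uncorrected form as an illustration only, so its values should not be expected to match the matrices above. The function name `cramers_v` is hypothetical, and the example inputs are the operator and end-floor values of the first ten sample rows of this report.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    min_dim = min(len(row_totals), len(col_totals)) - 1
    if min_dim == 0:
        return float("nan")  # undefined when either variable is constant
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            chi2 += (joint[(r, c)] - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * min_dim))


# 철도운영기관명 and 종료층 for the first ten sample rows.
operators = ["서울교통공사"] * 8 + ["코레일"] * 2
end_floors = ["지하1"] * 8 + ["지상2"] * 2

print(round(cramers_v(operators, end_floors), 3))
```

In these ten rows each operator maps to exactly one end floor, so the association is perfect. A constant column such as 선명 gives an undefined value under this formula.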

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
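
Nullity correlation can be illustrated with the two numeric columns, which are both empty in the same record (row 38 of the sample). The sketch below is a minimal example, assuming nullity correlation is the Pearson correlation between per-column missing indicators. It uses the 길이 and 폭 values from the last ten sample rows of this report. The `pearson` helper is written out by hand to keep the example dependency-free.

```python
import math

# (길이, 폭) for sample rows 37 to 46; None marks <NA>.
rows = [
    (1200, 120), (None, None), (1400, 950), (1400, 950), (1400, 950),
    (1250, 800), (1250, 800), (1250, 800), (1400, 1200), (1400, 1200),
]

# 1 where the value is missing, 0 where it is present.
length_null = [int(length is None) for length, _ in rows]
width_null = [int(width is None) for _, width in rows]


def pearson(a, b):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


print("Missing per column:", sum(length_null), sum(width_null))
print("Nullity correlation:", round(pearson(length_null, width_null), 3))
```

The two indicators are identical in these rows, so the correlation is 1: whenever one measurement is absent, so is the other.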

Sample

First rows

Row  철도운영기관명  선명  역명  휠체어리프트의 관리번호  출입구번호  상세위치  길이  폭  시작층  종료층
0  서울교통공사  1호선  서울역  1  <NA>  (B2)내부 C 계단  125  80  지하2  지하1
1  서울교통공사  1호선  신설동  1  <NA>  (F1)6번 출입구  125  80  지상1  지하1
2  서울교통공사  1호선  신설동  2  <NA>  (B2)상선승강장 시점측  125  80  지하2  지하1
3  서울교통공사  1호선  신설동  3  <NA>  (B2)하선승강장 시점측  125  80  지하2  지하1
4  서울교통공사  1호선  신설동  4  <NA>  (B1)대합실 연결통로  125  80  지하1  지하1
5  서울교통공사  1호선  신설동  5  <NA>  (B1)대합실 연결통로  125  80  지하1  지하1
6  서울교통공사  1호선  청량리(서울시립대입구)  1  <NA>  (B2)제기동측 승강장  125  80  지하2  지하1
7  서울교통공사  1호선  청량리(서울시립대입구)  2  <NA>  (B2)섬식(상)8-2  125  110  지하2  지하1
8  코레일  1호선  가산디지털단지  1  1  (2F) 1번출구 옆  3000  120  지상1  지상2
9  코레일  1호선  관악  1  <NA>  1F(석수역 방향 상행승강장 계단 옆)  1100  800  지상1  지상2

Last rows

Row  철도운영기관명  선명  역명  휠체어리프트의 관리번호  출입구번호  상세위치  길이  폭  시작층  종료층
37  코레일  1호선  신이문  3  01월 02일  (1F) 외대앞역 승강장 10-4 앞  1200  120  지상1  지상2
38  코레일  1호선  아산  1  1  CA웨딩홀 방면 환승통로  <NA>  <NA>  지상1  지상2
39  코레일  1호선  영등포  1  <NA>  (1F) 수원역(기차) 방향 승강장 남쪽  1400  950  지상1  지상1
40  코레일  1호선  영등포  2  <NA>  (1F) 수원역(기차) 방향 승강장 북쪽  1400  950  지상1  지상1
41  코레일  1호선  영등포  3  <NA>  (1F) 용산역(기차) 방향 승강장 남쪽  1400  950  지상1  지상1
42  코레일  1호선  오산  1  1  국철 하행승강장 내려가는 입구  1250  800  지상1  지상1
43  코레일  1호선  오산  2  2  국철 상행승강장 내려가는 입구  1250  800  지상2  지상2
44  코레일  1호선  외대앞  1  1  (1F) 신이문역 방향 승강장 남쪽 계단 앞  1250  800  지상1  지상3
45  코레일  1호선  평택  1  <NA>  (1F) 국철 상선 2/ 3호차 타는곳  1400  1200  지상1  지상1
46  코레일  1호선  평택  2  <NA>  (1F) 국철 하선 1/ 2호차 타는곳  1400  1200  지상1  지상1