Overview

Dataset statistics

Number of variables: 6
Number of observations: 67
Missing cells: 50
Missing cells (%): 12.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.4 KiB
Average record size in memory: 52.0 B
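
The percentage and size figures above follow from the raw counts by simple arithmetic. The following is a minimal check that uses only numbers printed in this block; the variable names are illustrative.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 6
n_observations = 67
missing_cells = 50
avg_record_bytes = 52.0

total_cells = n_variables * n_observations            # 6 * 67 = 402 cells
missing_pct = 100 * missing_cells / total_cells       # share of empty cells
total_kib = avg_record_bytes * n_observations / 1024  # bytes -> KiB

print(f"total cells:   {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")   # 12.4%
print(f"total size:    {total_kib:.1f} KiB")  # 3.4 KiB
```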

Variable types

Categorical: 4
Numeric: 2

Dataset

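Description: Station-structure data for Busan Metro Line 3, operated by Busan Transportation Corporation (부산교통공사). The source lists the fields railway operator name (철도운영기관명), line name (선명), station name (역명), above/below-ground classification (지상구분), station floor (역층), station floor classification (역층구분), and area (면적).
Author: Korea National Railway (국가철도공단)
URL: https://www.data.go.kr/data/15041137/fileData.do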

Alerts

철도운영기관명 has constant value "부산교통공사" (Constant)
선명 has constant value "3호선" (Constant)
역층 is highly overall correlated with 면적 (High correlation)
면적 is highly overall correlated with 역층 and 1 other field (High correlation)
역명 is highly overall correlated with 면적 (High correlation)
면적 has 50 (74.6%) missing values (Missing)
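
The "Constant" alerts flag columns that hold a single distinct value. The sketch below shows one way such a check could work on four rows taken from this report's Sample section. It is not the profiling tool's own implementation, and `constant_columns` is a hypothetical helper name.

```python
# Four rows copied from the Sample section (index 0, 4, 8 and 9).
# None stands for a missing value (<NA>).
rows = [
    {"철도운영기관명": "부산교통공사", "선명": "3호선", "역명": "강서구청",
     "지상구분": "지상", "역층": 1, "면적": None},
    {"철도운영기관명": "부산교통공사", "선명": "3호선", "역명": "강서구청",
     "지상구분": "지상", "역층": 5, "면적": 1952},
    {"철도운영기관명": "부산교통공사", "선명": "3호선", "역명": "거제(법원·검찰청)",
     "지상구분": "지하", "역층": 2, "면적": 2770},
    {"철도운영기관명": "부산교통공사", "선명": "3호선", "역명": "구포",
     "지상구분": "지상", "역층": 1, "면적": 2735},
]


def constant_columns(rows):
    """Return the columns whose non-missing values are all identical."""
    result = []
    for column in rows[0]:
        distinct = {row[column] for row in rows if row[column] is not None}
        if len(distinct) == 1:
            result.append(column)
    return result


print(constant_columns(rows))
```

Constant columns carry no information for modelling or correlation analysis, which is why they are usually dropped before further work.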

Reproduction

Analysis started: 2023-12-12 01:36:47.928792
Analysis finished: 2023-12-12 01:36:49.343295
Duration: 1.41 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
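
The duration is the difference between the two timestamps above. The sketch below recomputes it and then outlines how a report of this kind could be regenerated. `build_report`, the file paths, and the title are hypothetical; the report does not record the input file name, its encoding, or the options that were used. The helper imports pandas and ydata-profiling lazily, so the timestamp check runs without them.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block of this report.
started = datetime.fromisoformat("2023-12-12 01:36:47.928792")
finished = datetime.fromisoformat("2023-12-12 01:36:49.343295")
duration_s = (finished - started).total_seconds()
print(f"duration: {duration_s:.2f} seconds")  # 1.41 seconds


def build_report(csv_path, html_path):
    """Hypothetical helper: profile a CSV file and write an HTML report.

    Requires pandas and ydata-profiling; default settings are assumed.
    """
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Busan Metro Line 3 station structure")
    profile.to_file(html_path)
```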

Variables

철도운영기관명 (railway operator name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 668.0 B

부산교통공사 (Busan Transportation Corporation): 67

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 부산교통공사
2nd row: 부산교통공사
3rd row: 부산교통공사
4th row: 부산교통공사
5th row: 부산교통공사

Common Values

Value    Count    Frequency (%)
부산교통공사    67    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; 부산교통공사 accounts for all 67 rows (100.0%)]

선명 (line name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 668.0 B

3호선 (Line 3): 67

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3호선
2nd row: 3호선
3rd row: 3호선
4th row: 3호선
5th row: 3호선

Common Values

Value    Count    Frequency (%)
3호선    67    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; 3호선 accounts for all 67 rows (100.0%)]

역명 (station name)
Categorical

HIGH CORRELATION

Distinct: 17
Distinct (%): 25.4%
Missing: 0
Missing (%): 0.0%
Memory size: 668.0 B

강서구청: 6
체육공원: 6
배산: 5
물만골: 5
미남: 4
Other values (12): 41

Length

Max length: 12
Median length: 10
Mean length: 4.4925373
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강서구청
2nd row: 강서구청
3rd row: 강서구청
4th row: 강서구청
5th row: 강서구청

Common Values

Value    Count    Frequency (%)
강서구청    6    9.0%
체육공원    6    9.0%
배산    5    7.5%
물만골    5    7.5%
미남    4    6.0%
종합운동장    4    6.0%
숙등(부민병원)    4    6.0%
망미(병무청)    4    6.0%
덕천(부산과기대)    4    6.0%
대저    4    6.0%
Other values (7)    21    31.3%

Length

[Figure: Histogram of lengths of the category]

지상구분 (above/below-ground classification)
Categorical

Distinct: 3
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Memory size: 668.0 B

지하 (underground): 39
지상 (above ground): 27
중간 (intermediate): 1

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 1.5%

Sample

1st row: 지상
2nd row: 지상
3rd row: 지상
4th row: 지상
5th row: 지상

Common Values

Value    Count    Frequency (%)
지하    39    58.2%
지상    27    40.3%
중간    1    1.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; 지하 39 (58.2%), 지상 27 (40.3%), 중간 1 (1.5%)]

역층 (station floor)
Real number (ℝ)

HIGH CORRELATION

Distinct: 9
Distinct (%): 13.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.3283582
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 735.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 2
Q3: 3
95-th percentile: 5.7
Maximum: 9
Range: 8
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.7700202
Coefficient of variation (CV): 0.76020098
Kurtosis: 3.5561662
Mean: 2.3283582
Median Absolute Deviation (MAD): 1
Skewness: 1.8135392
Sum: 156
Variance: 3.1329715
Monotonicity: Not monotonic
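
Several of the descriptive statistics above can be recomputed from the 역층 frequency table printed in this section. The sketch below rebuilds the 67 values from that table and uses the sample variance (denominator n - 1), which is the convention that matches the printed figure. The variable names are illustrative.

```python
import statistics

# Frequency table for 역층 as printed in this report: floor value -> row count.
floor_counts = {1: 30, 2: 14, 3: 11, 4: 5, 5: 3, 6: 1, 7: 1, 8: 1, 9: 1}
floors = [value for value, count in floor_counts.items() for _ in range(count)]

total = sum(floors)                      # reported Sum: 156
mean = statistics.mean(floors)           # reported Mean: 2.3283582
variance = statistics.variance(floors)   # sample variance, n - 1 denominator
std = statistics.stdev(floors)           # square root of the sample variance
cv = std / mean                          # coefficient of variation

print(f"n={len(floors)} sum={total} mean={mean:.7f}")
print(f"variance={variance:.7f} std={std:.7f} cv={cv:.8f}")
```

The coefficient of variation is simply the standard deviation divided by the mean, so a value of about 0.76 means the spread is roughly three quarters of the mean floor number.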
[Figure: Histogram with fixed size bins (bins=9)]

Common Values

Value    Count    Frequency (%)
1    30    44.8%
2    14    20.9%
3    11    16.4%
4    5    7.5%
5    3    4.5%
9    1    1.5%
6    1    1.5%
7    1    1.5%
8    1    1.5%

Minimum 10 values

Value    Count    Frequency (%)
1    30    44.8%
2    14    20.9%
3    11    16.4%
4    5    7.5%
5    3    4.5%
6    1    1.5%
7    1    1.5%
8    1    1.5%
9    1    1.5%

Maximum 10 values

Value    Count    Frequency (%)
9    1    1.5%
8    1    1.5%
7    1    1.5%
6    1    1.5%
5    3    4.5%
4    5    7.5%
3    11    16.4%
2    14    20.9%
1    30    44.8%

면적 (area)
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 17
Distinct (%): 100.0%
Missing: 50
Missing (%): 74.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 2156
Minimum: 1366
Maximum: 2996
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 735.0 B

Quantile statistics

Minimum: 1366
5-th percentile: 1640.4
Q1: 1718
median: 2065
Q3: 2559
95-th percentile: 2815.2
Maximum: 2996
Range: 1630
Interquartile range (IQR): 841
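
The two sorted value tables further down in this section (the ten smallest and the ten largest values) together list all 17 non-missing 면적 values, so the quantile statistics above can be recomputed. The sketch below assumes linear interpolation between order statistics, which Python's `statistics.quantiles` provides with `method="inclusive"`; that the profiling tool uses the same rule is an assumption, supported only by the fact that the results agree with the printed figures.

```python
import statistics

# All 17 non-missing 면적 values, merged from the two sorted tables.
areas = sorted([
    1366, 1709, 1711, 1715, 1718, 1800, 1952, 1986, 2065,
    2115, 2189, 2541, 2559, 2725, 2735, 2770, 2996,
])

# Quartiles with linear interpolation between order statistics.
q1, median, q3 = statistics.quantiles(areas, n=4, method="inclusive")
iqr = q3 - q1

# Cutting into 20 parts yields the 5th, 10th, ..., 95th percentiles.
ventiles = statistics.quantiles(areas, n=20, method="inclusive")
p5, p95 = ventiles[0], ventiles[-1]

# Median absolute deviation: median distance from the median.
mad = statistics.median(abs(a - median) for a in areas)

print(f"Q1={q1} median={median} Q3={q3} IQR={iqr}")
print(f"p5={p5} p95={p95} MAD={mad} range={areas[-1] - areas[0]}")
```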

Descriptive statistics

Standard deviation: 478.8524
Coefficient of variation (CV): 0.22210223
Kurtosis: -1.1152378
Mean: 2156
Median Absolute Deviation (MAD): 354
Skewness: 0.27445862
Sum: 36652
Variance: 229299.62
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=17)]

Common Values

Value    Count    Frequency (%)
2770    1    1.5%
2115    1    1.5%
2541    1    1.5%
2559    1    1.5%
1709    1    1.5%
2725    1    1.5%
2189    1    1.5%
1711    1    1.5%
1952    1    1.5%
2065    1    1.5%
Other values (7)    7    10.4%
(Missing)    50    74.6%

Minimum 10 values

Value    Count    Frequency (%)
1366    1    1.5%
1709    1    1.5%
1711    1    1.5%
1715    1    1.5%
1718    1    1.5%
1800    1    1.5%
1952    1    1.5%
1986    1    1.5%
2065    1    1.5%
2115    1    1.5%

Maximum 10 values

Value    Count    Frequency (%)
2996    1    1.5%
2770    1    1.5%
2735    1    1.5%
2725    1    1.5%
2559    1    1.5%
2541    1    1.5%
2189    1    1.5%
2115    1    1.5%
2065    1    1.5%
1986    1    1.5%

Interactions

[Figures: four interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap for the matrix below]

Variable    역명    지상구분    역층    면적
역명    1.000    0.000    0.000    1.000
지상구분    0.000    1.000    0.000    0.000
역층    0.000    0.000    1.000    0.000
면적    1.000    0.000    0.000    1.000

[Figure: correlation heatmap for the matrix below]

Variable    역명    지상구분
역명    1.000    0.000
지상구분    0.000    1.000

[Figure: correlation heatmap for the matrix below]

Variable    역층    면적    역명    지상구분
역층    1.000    -0.507    0.000    0.000
면적    -0.507    1.000    1.000    0.137
역명    0.000    1.000    1.000    0.000
지상구분    0.000    0.137    0.000    1.000
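
The last matrix above reports -0.507 between 역층 and 면적, but the report does not say which coefficient that is. As an illustration only, and assuming a rank-based (Spearman) coefficient, the sketch below computes it for the five rows of this report's Sample section that carry a 면적 value. Five rows are too few to reproduce -0.507; the point is the mechanics and the negative sign. `average_ranks`, `pearson`, and `spearman` are illustrative helper names.

```python
import math

# (역층, 면적) pairs from the Sample section rows with a non-missing 면적
# (row index 4, 8, 9, 60 and 65).
pairs = [(5, 1952), (2, 2770), (1, 2735), (3, 2541), (4, 2115)]


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


floors = [floor for floor, _ in pairs]
areas = [area for _, area in pairs]
rho = spearman(floors, areas)
print(f"Spearman rho on the five sample rows: {rho:.3f}")
```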

Missing values

[Figure: nullity by column] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
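
The per-column nullity shown in the first figure is a count of missing cells per column. The sketch below reproduces that idea as a text bar chart for the first ten rows listed in this report's Sample section. It is a stand-in for the plot, not the plotting code the report used, and `missing_counts` is a hypothetical helper name.

```python
COLUMNS = ["철도운영기관명", "선명", "역명", "지상구분", "역층", "면적"]
OPERATOR, LINE = "부산교통공사", "3호선"

# First ten rows of the Sample section; None stands for <NA>.
head_rows = [
    (OPERATOR, LINE, "강서구청", "지상", 1, None),
    (OPERATOR, LINE, "강서구청", "지상", 2, None),
    (OPERATOR, LINE, "강서구청", "지상", 3, None),
    (OPERATOR, LINE, "강서구청", "지상", 4, None),
    (OPERATOR, LINE, "강서구청", "지상", 5, 1952),
    (OPERATOR, LINE, "강서구청", "지하", 4, None),
    (OPERATOR, LINE, "거제(법원·검찰청)", "지상", 1, None),
    (OPERATOR, LINE, "거제(법원·검찰청)", "지하", 1, None),
    (OPERATOR, LINE, "거제(법원·검찰청)", "지하", 2, 2770),
    (OPERATOR, LINE, "구포", "지상", 1, 2735),
]


def missing_counts(columns, rows):
    """Count missing (None) cells per column."""
    return {
        column: sum(1 for row in rows if row[index] is None)
        for index, column in enumerate(columns)
    }


counts = missing_counts(COLUMNS, head_rows)
for column, missing in counts.items():
    present = len(head_rows) - missing
    # One '#' per present value, one '.' per missing value.
    print(f"{column}\t{'#' * present}{'.' * missing}\t{present}/{len(head_rows)}")
```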

Sample

First rows

Row    철도운영기관명    선명    역명    지상구분    역층    면적
0    부산교통공사    3호선    강서구청    지상    1    <NA>
1    부산교통공사    3호선    강서구청    지상    2    <NA>
2    부산교통공사    3호선    강서구청    지상    3    <NA>
3    부산교통공사    3호선    강서구청    지상    4    <NA>
4    부산교통공사    3호선    강서구청    지상    5    1952
5    부산교통공사    3호선    강서구청    지하    4    <NA>
6    부산교통공사    3호선    거제(법원·검찰청)    지상    1    <NA>
7    부산교통공사    3호선    거제(법원·검찰청)    지하    1    <NA>
8    부산교통공사    3호선    거제(법원·검찰청)    지하    2    2770
9    부산교통공사    3호선    구포    지상    1    2735

Last rows

Row    철도운영기관명    선명    역명    지상구분    역층    면적
57    부산교통공사    3호선    종합운동장    지상    1    <NA>
58    부산교통공사    3호선    종합운동장    지하    1    <NA>
59    부산교통공사    3호선    종합운동장    지하    2    <NA>
60    부산교통공사    3호선    종합운동장    지하    3    2541
61    부산교통공사    3호선    체육공원    중간    3    <NA>
62    부산교통공사    3호선    체육공원    지상    1    <NA>
63    부산교통공사    3호선    체육공원    지상    2    <NA>
64    부산교통공사    3호선    체육공원    지상    3    <NA>
65    부산교통공사    3호선    체육공원    지상    4    2115
66    부산교통공사    3호선    체육공원    지하    3    <NA>