Overview

Dataset statistics

Number of variables: 10
Number of observations: 33
Missing cells: 22
Missing cells (%): 6.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.8 KiB
Average record size in memory: 86.0 B
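
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations); the variable names are illustrative, and the record size is only approximate because the report rounds the total to 2.8 KiB:

```python
# Counts copied from the dataset statistics above.
n_variables = 10
n_observations = 33
missing_cells = 22

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Reconstructed from the rounded total (2.8 KiB), so this only
# approximates the reported 86.0 B per record.
approx_record_bytes = 2.8 * 1024 / n_observations

print(f"{missing_pct:.1f}%")
print(f"{approx_record_bytes:.1f} B")
```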

Variable types

Categorical: 8
Text: 2

Dataset

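Description: Data for the urban and metropolitan railway stations on Busan Line 2 (부산2호선): railway operator name (철도운영기관명), line name (선명), station name (역명), above/below-ground classification (지상지하구분), station floor of the defibrillator (역층), exit number (출입구번호), detailed location (상세위치), defibrillator output energy (제세동기출력에너지), defibrillator operating mode (제세동기운영방식), and quantity (수량).
Author: 국가철도공단
URL: https://www.data.go.kr/data/15041481/fileData.do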

Alerts

철도운영기관명 has constant value "부산교통공사" (Constant)
선명 has constant value "2호선" (Constant)
수량 has constant value "1" (Constant)
역층 is highly overall correlated with 상세위치 (High correlation)
지상지하구분 is highly overall correlated with 상세위치 (High correlation)
상세위치 is highly overall correlated with 지상지하구분 and 1 other field (High correlation)
지상지하구분 is highly imbalanced (56.1%) (Imbalance)
역층 is highly imbalanced (67.0%) (Imbalance)
상세위치 is highly imbalanced (57.8%) (Imbalance)
제세동기운영방식 is highly imbalanced (58.7%) (Imbalance)
출입구번호 has 22 (66.7%) missing values (Missing)
역명 has unique values (Unique)
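
The imbalance percentages in the alerts can be reproduced from the category counts listed in the Common Values tables of this report. The sketch below assumes the score is one minus the Shannon entropy normalized by the log of the number of classes; that formula is an assumption checked only against the four figures above, and `imbalance_score` is an illustrative name, not a library function.

```python
import math

def imbalance_score(counts):
    """Return 1 - normalized Shannon entropy for a list of category counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # assumed convention: a single class gets score 0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Category counts taken from the Common Values tables in this report.
category_counts = {
    "지상지하구분": [30, 3],
    "역층": [31, 2],
    "상세위치": [28, 2, 2, 1],
    "제세동기운영방식": [29, 2, 2],
}

scores = {name: round(100 * imbalance_score(c), 1)
          for name, c in category_counts.items()}
for name, score in scores.items():
    print(f"{name}: {score}%")
```

Under this assumption a score near 100% means almost all rows share one category, and a score of 0% means the categories are evenly spread.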

Reproduction

Analysis started: 2023-12-12 03:43:30.992647
Analysis finished: 2023-12-12 03:43:31.735083
Duration: 0.74 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

철도운영기관명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 396.0 B
부산교통공사: 33

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 부산교통공사
2nd row: 부산교통공사
3rd row: 부산교통공사
4th row: 부산교통공사
5th row: 부산교통공사

Common Values

Value | Count | Frequency (%)
부산교통공사 | 33 | 100.0%

Length

Histogram of lengths of the category

선명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 396.0 B
2호선: 33

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2호선
2nd row: 2호선
3rd row: 2호선
4th row: 2호선
5th row: 2호선

Common Values

Value | Count | Frequency (%)
2호선 | 33 | 100.0%

Length

Histogram of lengths of the category

역명
Text

UNIQUE 

Distinct: 33
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 396.0 B

Length

Max length: 10
Median length: 2
Mean length: 2.5454545
Min length: 2

Characters and Unicode

Total characters: 84
Distinct characters: 61
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
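
The mean length reported above is consistent with dividing the total character count by the number of non-missing values. A short check of that relationship using figures from this report; `mean_length` is an illustrative helper, and the 출입구번호 figures are 36 characters over 33 − 22 = 11 non-missing values:

```python
def mean_length(total_characters, n_values):
    """Mean string length = total characters / number of non-missing values."""
    return total_characters / n_values

station_name_mean = mean_length(84, 33)       # 역명: 84 characters, 33 values
exit_number_mean = mean_length(36, 33 - 22)   # 출입구번호: 22 of 33 values missing

print(round(station_name_mean, 7))
print(round(exit_number_mean, 7))
```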

Unique

Unique: 33
Unique (%): 100.0%

Sample

1st row: 장산
2nd row: 중동
3rd row: 해운대
4th row: 동백
5th row: 벡스코

Value | Count | Frequency (%)
장산 | 1 | 3.0%
냉정 | 1 | 3.0%
금곡 | 1 | 3.0%
동원 | 1 | 3.0%
율리 | 1 | 3.0%
화명 | 1 | 3.0%
수정 | 1 | 3.0%
덕천 | 1 | 3.0%
구명 | 1 | 3.0%
구남 | 1 | 3.0%
Other values (23) | 23 | 69.7%

Most occurring characters

Value | Count | Frequency (%)
 | 4 | 4.8%
 | 3 | 3.6%
 | 3 | 3.6%
 | 3 | 3.6%
 | 3 | 3.6%
 | 2 | 2.4%
 | 2 | 2.4%
 | 2 | 2.4%
 | 2 | 2.4%
 | 2 | 2.4%
Other values (51) | 58 | 69.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 84 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 4 | 4.8%
 | 3 | 3.6%
 | 3 | 3.6%
 | 3 | 3.6%
 | 3 | 3.6%
 | 2 | 2.4%
 | 2 | 2.4%
 | 2 | 2.4%
 | 2 | 2.4%
 | 2 | 2.4%
Other values (51) | 58 | 69.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 84 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 4 | 4.8%
 | 3 | 3.6%
 | 3 | 3.6%
 | 3 | 3.6%
 | 3 | 3.6%
 | 2 | 2.4%
 | 2 | 2.4%
 | 2 | 2.4%
 | 2 | 2.4%
 | 2 | 2.4%
Other values (51) | 58 | 69.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 84 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 4 | 4.8%
 | 3 | 3.6%
 | 3 | 3.6%
 | 3 | 3.6%
 | 3 | 3.6%
 | 2 | 2.4%
 | 2 | 2.4%
 | 2 | 2.4%
 | 2 | 2.4%
 | 2 | 2.4%
Other values (51) | 58 | 69.0%

지상지하구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 6.1%
Missing: 0
Missing (%): 0.0%
Memory size: 396.0 B
지하: 30
지상: 3

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지하
2nd row: 지하
3rd row: 지하
4th row: 지하
5th row: 지하

Common Values

Value | Count | Frequency (%)
지하 | 30 | 90.9%
지상 | 3 | 9.1%

Length

Histogram of lengths of the category

역층
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 6.1%
Missing: 0
Missing (%): 0.0%
Memory size: 396.0 B
1: 31
2: 2

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 31 | 93.9%
2 | 2 | 6.1%

Length

Histogram of lengths of the category

출입구번호
Text

MISSING 

Distinct: 7
Distinct (%): 63.6%
Missing: 22
Missing (%): 66.7%
Memory size: 396.0 B

Length

Max length: 6
Median length: 3
Mean length: 3.2727273
Min length: 3

Characters and Unicode

Total characters: 36
Distinct characters: 8
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): 54.5%

Sample

1st row: 8번
2nd row: 4번
3rd row: 3번
4th row: 2번
5th row: 2번

Value | Count | Frequency (%)
2번 | 6 | 50.0%
1번 | 2 | 16.7%
8번 | 1 | 8.3%
4번 | 1 | 8.3%
3번 | 1 | 8.3%
5번 | 1 | 8.3%

Most occurring characters

Value | Count | Frequency (%)
번 | 12 | 33.3%
(space) | 12 | 33.3%
2 | 6 | 16.7%
1 | 2 | 5.6%
8 | 1 | 2.8%
4 | 1 | 2.8%
3 | 1 | 2.8%
5 | 1 | 2.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 12 | 33.3%
Space Separator | 12 | 33.3%
Decimal Number | 12 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 6 | 50.0%
1 | 2 | 16.7%
8 | 1 | 8.3%
4 | 1 | 8.3%
3 | 1 | 8.3%
5 | 1 | 8.3%
Other Letter
Value | Count | Frequency (%)
번 | 12 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 12 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 24 | 66.7%
Hangul | 12 | 33.3%

Most frequent character per script

Common
Value | Count | Frequency (%)
(space) | 12 | 50.0%
2 | 6 | 25.0%
1 | 2 | 8.3%
8 | 1 | 4.2%
4 | 1 | 4.2%
3 | 1 | 4.2%
5 | 1 | 4.2%
Hangul
Value | Count | Frequency (%)
번 | 12 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 24 | 66.7%
Hangul | 12 | 33.3%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
번 | 12 | 100.0%
ASCII
Value | Count | Frequency (%)
(space) | 12 | 50.0%
2 | 6 | 25.0%
1 | 2 | 8.3%
8 | 1 | 4.2%
4 | 1 | 4.2%
3 | 1 | 4.2%
5 | 1 | 4.2%

상세위치
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 12.1%
Missing: 0
Missing (%): 0.0%
Memory size: 396.0 B
(B1) 역무안전실: 28
(B2) 역무안전실: 2
(1F) 역무안전실: 2
(1F) 대합실: 1

Length

Max length: 10
Median length: 10
Mean length: 9.9393939
Min length: 8

Unique

Unique: 1
Unique (%): 3.0%

Sample

1st row: (B1) 역무안전실
2nd row: (B1) 역무안전실
3rd row: (B1) 역무안전실
4th row: (B1) 역무안전실
5th row: (B1) 역무안전실

Common Values

Value | Count | Frequency (%)
(B1) 역무안전실 | 28 | 84.8%
(B2) 역무안전실 | 2 | 6.1%
(1F) 역무안전실 | 2 | 6.1%
(1F) 대합실 | 1 | 3.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
역무안전실 | 32 | 48.5%
b1 | 28 | 42.4%
1f | 3 | 4.5%
b2 | 2 | 3.0%
대합실 | 1 | 1.5%

제세동기출력에너지
Categorical

Distinct: 4
Distinct (%): 12.1%
Missing: 0
Missing (%): 0.0%
Memory size: 396.0 B
150: 13
성인 150 / 소아 50: 10
200: 9
성인150 소아50: 1

Length

Max length: 14
Median length: 3
Mean length: 6.5454545
Min length: 3

Unique

Unique: 1
Unique (%): 3.0%

Sample

1st row: 200
2nd row: 200
3rd row: 150
4th row: 200
5th row: 200

Common Values

Value | Count | Frequency (%)
150 | 13 | 39.4%
성인 150 / 소아 50 | 10 | 30.3%
200 | 9 | 27.3%
성인150 소아50 | 1 | 3.0%
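
The table above mixes bare numbers with two spellings of an adult/child pair ("성인 150 / 소아 50" and "성인150 소아50"; 성인 means adult, 소아 means child). A minimal normalization sketch follows. The function name and the decision to return no child value for a bare number are assumptions, not something the dataset documents.

```python
import re

def parse_output_energy(value):
    """Split a 제세동기출력에너지 string into (adult, child) energy values.

    Assumption: a bare number such as "200" is a single setting with no
    separate child value, so the child slot is returned as None.
    """
    adult = re.search(r"성인\s*(\d+)", value)
    child = re.search(r"소아\s*(\d+)", value)
    if adult and child:
        return int(adult.group(1)), int(child.group(1))
    bare = re.fullmatch(r"\s*(\d+)\s*", value)
    if bare:
        return int(bare.group(1)), None
    raise ValueError(f"unrecognized energy format: {value!r}")

# The four distinct values observed in this column.
observed = ["150", "성인 150 / 소아 50", "200", "성인150 소아50"]
parsed = {value: parse_output_energy(value) for value in observed}
for value, pair in parsed.items():
    print(repr(value), "->", pair)
```

Parsed this way, the two adult/child spellings collapse to the same pair, which would reduce the column from four distinct values to three.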

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
150 | 23 | 31.1%
성인 | 10 | 13.5%
/ | 10 | 13.5%
소아 | 10 | 13.5%
50 | 10 | 13.5%
200 | 9 | 12.2%
성인150 | 1 | 1.4%
소아50 | 1 | 1.4%

제세동기운영방식
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Memory size: 396.0 B
자동: 29
자동/수동: 2
수동: 2

Length

Max length: 5
Median length: 2
Mean length: 2.1818182
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 자동
2nd row: 자동
3rd row: 자동
4th row: 자동
5th row: 자동

Common Values

Value | Count | Frequency (%)
자동 | 29 | 87.9%
자동/수동 | 2 | 6.1%
수동 | 2 | 6.1%

Length

Histogram of lengths of the category

수량
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 396.0 B
1: 33

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 33 | 100.0%

Length

Histogram of lengths of the category

Correlations

Variable | 역명 | 지상지하구분 | 역층 | 출입구번호 | 상세위치 | 제세동기출력에너지 | 제세동기운영방식
역명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
지상지하구분 | 1.000 | 1.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000
역층 | 1.000 | 0.000 | 1.000 | NaN | 1.000 | 0.145 | 0.000
출입구번호 | 1.000 | 0.000 | NaN | 1.000 | 0.000 | 0.695 | 1.000
상세위치 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000 | 0.206 | 0.000
제세동기출력에너지 | 1.000 | 0.000 | 0.145 | 0.695 | 0.206 | 1.000 | 0.144
제세동기운영방식 | 1.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.144 | 1.000

Variable | 제세동기출력에너지 | 역층 | 지상지하구분 | 상세위치 | 제세동기운영방식
제세동기출력에너지 | 1.000 | 0.075 | 0.000 | 0.061 | 0.122
역층 | 0.075 | 1.000 | 0.000 | 0.967 | 0.000
지상지하구분 | 0.000 | 0.000 | 1.000 | 0.967 | 0.000
상세위치 | 0.061 | 0.967 | 0.967 | 1.000 | 0.000
제세동기운영방식 | 0.122 | 0.000 | 0.000 | 0.000 | 1.000

Variable | 지상지하구분 | 역층 | 상세위치 | 제세동기출력에너지 | 제세동기운영방식
지상지하구분 | 1.000 | 0.000 | 0.967 | 0.000 | 0.000
역층 | 0.000 | 1.000 | 0.967 | 0.075 | 0.000
상세위치 | 0.967 | 0.967 | 1.000 | 0.061 | 0.000
제세동기출력에너지 | 0.000 | 0.075 | 0.061 | 1.000 | 0.122
제세동기운영방식 | 0.000 | 0.000 | 0.000 | 0.122 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

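Index | 철도운영기관명 | 선명 | 역명 | 지상지하구분 | 역층 | 출입구번호 | 상세위치 | 제세동기출력에너지 | 제세동기운영방식 | 수량
0 | 부산교통공사 | 2호선 | 장산 | 지하 | 1 | 8번 | (B1) 역무안전실 | 200 | 자동 | 1
1 | 부산교통공사 | 2호선 | 중동 | 지하 | 1 | 4번 | (B1) 역무안전실 | 200 | 자동 | 1
2 | 부산교통공사 | 2호선 | 해운대 | 지하 | 1 | <NA> | (B1) 역무안전실 | 150 | 자동 | 1
3 | 부산교통공사 | 2호선 | 동백 | 지하 | 1 | <NA> | (B1) 역무안전실 | 200 | 자동 | 1
4 | 부산교통공사 | 2호선 | 벡스코 | 지하 | 1 | <NA> | (B1) 역무안전실 | 200 | 자동 | 1
5 | 부산교통공사 | 2호선 | 센텀시티 | 지하 | 1 | <NA> | (B1) 역무안전실 | 200 | 자동 | 1
6 | 부산교통공사 | 2호선 | 민락 | 지하 | 1 | <NA> | (B1) 역무안전실 | 150 | 자동 | 1
7 | 부산교통공사 | 2호선 | 수영 | 지하 | 1 | <NA> | (B1) 역무안전실 | 150 | 자동 | 1
8 | 부산교통공사 | 2호선 | 광안 | 지하 | 2 | <NA> | (B2) 역무안전실 | 150 | 자동 | 1
9 | 부산교통공사 | 2호선 | 금련산 | 지하 | 2 | <NA> | (B2) 역무안전실 | 150 | 자동 | 1

Index | 철도운영기관명 | 선명 | 역명 | 지상지하구분 | 역층 | 출입구번호 | 상세위치 | 제세동기출력에너지 | 제세동기운영방식 | 수량
23 | 부산교통공사 | 2호선 | 모라 | 지하 | 1 | 2번 | (B1) 역무안전실 | 150 | 자동 | 1
24 | 부산교통공사 | 2호선 | 구남 | 지하 | 1 | 2번 | (B1) 역무안전실 | 성인150 소아50 | 자동 | 1
25 | 부산교통공사 | 2호선 | 구명 | 지하 | 1 | 1번 2번 | (B1) 역무안전실 | 성인 150 / 소아 50 | 자동 | 1
26 | 부산교통공사 | 2호선 | 덕천 | 지하 | 1 | 1번 | (B1) 역무안전실 | 200 | 자동 | 1
27 | 부산교통공사 | 2호선 | 수정 | 지하 | 1 | <NA> | (B1) 역무안전실 | 성인 150 / 소아 50 | 자동 | 1
28 | 부산교통공사 | 2호선 | 화명 | 지하 | 1 | <NA> | (B1) 역무안전실 | 성인 150 / 소아 50 | 자동 | 1
29 | 부산교통공사 | 2호선 | 율리 | 지하 | 1 | <NA> | (B1) 역무안전실 | 성인 150 / 소아 50 | 자동 | 1
30 | 부산교통공사 | 2호선 | 동원 | 지상 | 1 | <NA> | (1F) 역무안전실 | 성인 150 / 소아 50 | 자동 | 1
31 | 부산교통공사 | 2호선 | 금곡 | 지상 | 1 | <NA> | (1F) 역무안전실 | 성인 150 / 소아 50 | 자동 | 1
32 | 부산교통공사 | 2호선 | 호포 | 지상 | 1 | 2번 | (1F) 대합실 | 150 | 자동 | 1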