Overview

Dataset statistics

Number of variables: 5
Number of observations: 99
Missing cells: 1
Missing cells (%): 0.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.0 KiB
Average record size in memory: 41.3 B
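The missing-cell percentage follows from the counts in this block. As a quick check (a minimal sketch; the variable names are illustrative and not taken from the profiling tool), the share is the number of missing cells divided by variables times observations:

```python
# Counts reported in the dataset statistics.
n_variables = 5
n_observations = 99
missing_cells = 1

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations

# Share of missing cells as a percentage, rounded to one decimal place.
missing_pct = round(100 * missing_cells / total_cells, 1)

print(f"{missing_cells} of {total_cells} cells missing ({missing_pct}%)")
```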

Variable types

Categorical: 2
Text: 3

Dataset

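Description: Railway operator name (철도운영기관명), line name (선명), station name (역명), lot-number address (지번주소), and road-name address (도로명주소) for the urban and metropolitan railway stations on Seoul Metropolitan Subway Line 1 (수도권1호선).
Author: 국가철도공단 (Korea National Railway)
URL: https://www.data.go.kr/data/15041120/fileData.do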

Alerts

선명 has constant value "1호선" [Constant]
철도운영기관명 is highly imbalanced (52.8%) [Imbalance]
지번주소 has 1 (1.0%) missing value [Missing]
역명 has unique values [Unique]
도로명주소 has unique values [Unique]
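The 52.8% imbalance figure is consistent with an entropy-based score: one minus the Shannon entropy of the class shares, divided by the maximum entropy for that number of classes. The sketch below assumes that definition (it is not stated in the report, and `imbalance_score` is a hypothetical helper name) and applies it to the two operator counts, 89 and 10:

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class shares.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    This definition is an assumption, not something the report states.
    """
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # the score is not meaningful for a single class; 0.0 by convention here
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in shares)
    return 1.0 - entropy / math.log2(n_classes)

# Counts for 철도운영기관명: 코레일 = 89, 서울교통공사 = 10.
score = imbalance_score([89, 10])
print(f"{100 * score:.1f}%")  # prints 52.8%
```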

Reproduction

Analysis started: 2023-12-12 09:05:25.047221
Analysis finished: 2023-12-12 09:05:25.725828
Duration: 0.68 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
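The reported duration can be checked from the two timestamps. A minimal sketch using only the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps as printed in the Reproduction block.
started = datetime.fromisoformat("2023-12-12 09:05:25.047221")
finished = datetime.fromisoformat("2023-12-12 09:05:25.725828")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()

print(f"{duration_s:.2f} seconds")  # prints 0.68 seconds
```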

Variables

철도운영기관명
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B
코레일: 89
서울교통공사: 10

Length

Max length: 6
Median length: 3
Mean length: 3.3030303
Min length: 3
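These length statistics can be reproduced from the two category counts alone, since every row holds one of two strings. A minimal sketch (the `value_counts` and `stats` names are illustrative; lengths are counted in Unicode code points, as Python's `len` does):

```python
from statistics import mean, median

# Category counts for 철도운영기관명 as reported for this variable.
value_counts = {"코레일": 89, "서울교통공사": 10}

# One string length per row.
lengths = [len(value) for value, count in value_counts.items() for _ in range(count)]

stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": round(mean(lengths), 7),
    "min": min(lengths),
}
print(stats)
```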

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 코레일
2nd row: 코레일
3rd row: 코레일
4th row: 코레일
5th row: 코레일

Common Values

Value | Count | Frequency (%)
코레일 | 89 | 89.9%
서울교통공사 | 10 | 10.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

선명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B
1호선: 99

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1호선
2nd row: 1호선
3rd row: 1호선
4th row: 1호선
5th row: 1호선

Common Values

Value | Count | Frequency (%)
1호선 | 99 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

역명
Text

UNIQUE 

Distinct: 99
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B

Length

Max length: 12
Median length: 2
Mean length: 2.6464646
Min length: 2

Characters and Unicode

Total characters: 262
Distinct characters: 118
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
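To illustrate how such per-category counts can be derived, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module for two station names that appear in this dataset. The helper name `category_counts` is hypothetical, and this is not necessarily how the profiling tool computes its figures. The two-letter codes correspond to the category names used in this report: `Lo` is Other Letter, `Nd` is Decimal Number, `Ps` is Open Punctuation, and `Pe` is Close Punctuation.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# Two station names taken from the sample rows of this dataset.
sample = ["종로3가", "청량리(서울시립대입구)"]
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

Script and block membership are not exposed by `unicodedata`, so reproducing those tables would need additional data or a third-party package.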

Unique

Unique: 99
Unique (%): 100.0%

Sample

1st row: 가능
2nd row: 가산디지털단지
3rd row: 간석
4th row: 개봉
5th row: 관악

Value | Count | Frequency (%)
가능 | 1 | 1.0%
소요산 | 1 | 1.0%
의왕 | 1 | 1.0%
월계 | 1 | 1.0%
용산 | 1 | 1.0%
외대앞 | 1 | 1.0%
온양온천 | 1 | 1.0%
온수 | 1 | 1.0%
오산대 | 1 | 1.0%
오산 | 1 | 1.0%
Other values (89) | 89 | 89.9%

Most occurring characters

ValueCountFrequency (%)
12
 
4.6%
10
 
3.8%
10
 
3.8%
9
 
3.4%
7
 
2.7%
5
 
1.9%
5
 
1.9%
5
 
1.9%
4
 
1.5%
4
 
1.5%
Other values (108) 191
72.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 254 | 96.9%
Open Punctuation | 3 | 1.1%
Close Punctuation | 3 | 1.1%
Decimal Number | 2 | 0.8%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
12
 
4.7%
10
 
3.9%
10
 
3.9%
9
 
3.5%
7
 
2.8%
5
 
2.0%
5
 
2.0%
5
 
2.0%
4
 
1.6%
4
 
1.6%
Other values (104) 183
72.0%
Decimal Number
Value | Count | Frequency (%)
5 | 1 | 50.0%
3 | 1 | 50.0%
Open Punctuation
Value | Count | Frequency (%)
( | 3 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 254 | 96.9%
Common | 8 | 3.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
12
 
4.7%
10
 
3.9%
10
 
3.9%
9
 
3.5%
7
 
2.8%
5
 
2.0%
5
 
2.0%
5
 
2.0%
4
 
1.6%
4
 
1.6%
Other values (104) 183
72.0%
Common
Value | Count | Frequency (%)
( | 3 | 37.5%
) | 3 | 37.5%
5 | 1 | 12.5%
3 | 1 | 12.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 254 | 96.9%
ASCII | 8 | 3.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
12
 
4.7%
10
 
3.9%
10
 
3.9%
9
 
3.5%
7
 
2.8%
5
 
2.0%
5
 
2.0%
5
 
2.0%
4
 
1.6%
4
 
1.6%
Other values (104) 183
72.0%
ASCII
Value | Count | Frequency (%)
( | 3 | 37.5%
) | 3 | 37.5%
5 | 1 | 12.5%
3 | 1 | 12.5%

지번주소
Text

MISSING 

Distinct: 98
Distinct (%): 100.0%
Missing: 1
Missing (%): 1.0%
Memory size: 924.0 B

Length

Max length: 31
Median length: 28
Mean length: 19.316327
Min length: 7

Characters and Unicode

Total characters: 1893
Distinct characters: 151
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 98
Unique (%): 100.0%
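The figures for this variable suggest that the distinct and unique percentages, and the mean length, are computed over the 98 non-missing values, while the missing percentage is computed over all 99 rows. That reading is an inference from the numbers, not something the report states. A minimal sketch with synthetic placeholder strings standing in for the real addresses:

```python
# Placeholder column: 98 distinct strings plus one missing value (None).
# The strings are synthetic stand-ins, not the real addresses.
values = [f"address-{i}" for i in range(98)] + [None]

present = [v for v in values if v is not None]
n_missing = len(values) - len(present)
n_distinct = len(set(present))

# Missing share is taken over all rows; distinct share over non-missing rows.
missing_pct = round(100 * n_missing / len(values), 1)
distinct_pct = round(100 * n_distinct / len(present), 1)

# Mean length from the reported total of 1893 characters over 98 non-missing values.
mean_length = round(1893 / len(present), 6)

print(missing_pct, distinct_pct, mean_length)
```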

Sample

1st row: 경기도 의정부시 가능동 197-1
2nd row: 서울특별시 금천구 가산동 468-4
3rd row: 인천광역시 남동구 간석4동 762번지
4th row: 서울특별시 구로구 개보동 415
5th row: 경기도 안양시 만악구 석수동 101-16

Value | Count | Frequency (%)
경기도 | 38 | 9.2%
서울특별시 | 28 | 6.8%
인천광역시 | 9 | 2.2%
충청남도 | 7 | 1.7%
동대문구 | 6 | 1.5%
서울시 | 6 | 1.5%
구로구 | 6 | 1.5%
천안시 | 5 | 1.2%
종로구 | 5 | 1.2%
동두천시 | 5 | 1.2%
Other values (238) | 298 | 72.2%

Most occurring characters

ValueCountFrequency (%)
319
 
16.9%
101
 
5.3%
95
 
5.0%
1 84
 
4.4%
- 69
 
3.6%
69
 
3.6%
55
 
2.9%
6 46
 
2.4%
4 44
 
2.3%
5 42
 
2.2%
Other values (141) 969
51.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1098 | 58.0%
Decimal Number | 383 | 20.2%
Space Separator | 319 | 16.9%
Dash Punctuation | 69 | 3.6%
Open Punctuation | 12 | 0.6%
Close Punctuation | 12 | 0.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
101
 
9.2%
95
 
8.7%
69
 
6.3%
55
 
5.0%
42
 
3.8%
40
 
3.6%
39
 
3.6%
36
 
3.3%
30
 
2.7%
28
 
2.6%
Other values (127) 563
51.3%
Decimal Number
Value | Count | Frequency (%)
1 | 84 | 21.9%
6 | 46 | 12.0%
4 | 44 | 11.5%
5 | 42 | 11.0%
2 | 41 | 10.7%
3 | 40 | 10.4%
8 | 25 | 6.5%
7 | 23 | 6.0%
9 | 22 | 5.7%
0 | 16 | 4.2%
Space Separator
Value | Count | Frequency (%)
(space) | 319 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 69 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 12 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 12 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1098 | 58.0%
Common | 795 | 42.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
101
 
9.2%
95
 
8.7%
69
 
6.3%
55
 
5.0%
42
 
3.8%
40
 
3.6%
39
 
3.6%
36
 
3.3%
30
 
2.7%
28
 
2.6%
Other values (127) 563
51.3%
Common
Value | Count | Frequency (%)
(space) | 319 | 40.1%
1 | 84 | 10.6%
- | 69 | 8.7%
6 | 46 | 5.8%
4 | 44 | 5.5%
5 | 42 | 5.3%
2 | 41 | 5.2%
3 | 40 | 5.0%
8 | 25 | 3.1%
7 | 23 | 2.9%
Other values (4) | 62 | 7.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1098 | 58.0%
ASCII | 795 | 42.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 319 | 40.1%
1 | 84 | 10.6%
- | 69 | 8.7%
6 | 46 | 5.8%
4 | 44 | 5.5%
5 | 42 | 5.3%
2 | 41 | 5.2%
3 | 40 | 5.0%
8 | 25 | 3.1%
7 | 23 | 2.9%
Other values (4) | 62 | 7.8%
Hangul
ValueCountFrequency (%)
101
 
9.2%
95
 
8.7%
69
 
6.3%
55
 
5.0%
42
 
3.8%
40
 
3.6%
39
 
3.6%
36
 
3.3%
30
 
2.7%
28
 
2.6%
Other values (127) 563
51.3%

도로명주소
Text

UNIQUE 

Distinct: 99
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 924.0 B

Length

Max length: 33
Median length: 26
Mean length: 19.30303
Min length: 12

Characters and Unicode

Total characters: 1911
Distinct characters: 157
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 99
Unique (%): 100.0%

Sample

1st row: 경기도 의정부시 평화로 633
2nd row: 서울특별시 금천구 벚꽃로 309
3rd row: 인천광역시 남동구 석정로 522-14
4th row: 서울특별시 구로구 경인로40길 47
5th row: 경기도 안양시 만안구 경수대로1273번길 46

Value | Count | Frequency (%)
경기도 | 41 | 9.6%
서울특별시 | 29 | 6.8%
평화로 | 10 | 2.3%
인천광역시 | 10 | 2.3%
충청남도 | 9 | 2.1%
동대문구 | 6 | 1.4%
구로구 | 6 | 1.4%
천안시 | 6 | 1.4%
서울시 | 5 | 1.2%
종로 | 5 | 1.2%
Other values (219) | 301 | 70.3%

Most occurring characters

ValueCountFrequency (%)
372
 
19.5%
113
 
5.9%
97
 
5.1%
72
 
3.8%
58
 
3.0%
1 58
 
3.0%
52
 
2.7%
2 51
 
2.7%
45
 
2.4%
40
 
2.1%
Other values (147) 953
49.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1188 | 62.2%
Space Separator | 372 | 19.5%
Decimal Number | 313 | 16.4%
Close Punctuation | 15 | 0.8%
Open Punctuation | 15 | 0.8%
Dash Punctuation | 8 | 0.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
113
 
9.5%
97
 
8.2%
72
 
6.1%
58
 
4.9%
52
 
4.4%
45
 
3.8%
40
 
3.4%
37
 
3.1%
35
 
2.9%
30
 
2.5%
Other values (133) 609
51.3%
Decimal Number
Value | Count | Frequency (%)
1 | 58 | 18.5%
2 | 51 | 16.3%
3 | 36 | 11.5%
5 | 35 | 11.2%
9 | 31 | 9.9%
7 | 29 | 9.3%
6 | 22 | 7.0%
4 | 22 | 7.0%
0 | 17 | 5.4%
8 | 12 | 3.8%
Space Separator
Value | Count | Frequency (%)
(space) | 372 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 15 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 15 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 8 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1188 | 62.2%
Common | 723 | 37.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
113
 
9.5%
97
 
8.2%
72
 
6.1%
58
 
4.9%
52
 
4.4%
45
 
3.8%
40
 
3.4%
37
 
3.1%
35
 
2.9%
30
 
2.5%
Other values (133) 609
51.3%
Common
Value | Count | Frequency (%)
(space) | 372 | 51.5%
1 | 58 | 8.0%
2 | 51 | 7.1%
3 | 36 | 5.0%
5 | 35 | 4.8%
9 | 31 | 4.3%
7 | 29 | 4.0%
6 | 22 | 3.0%
4 | 22 | 3.0%
0 | 17 | 2.4%
Other values (4) | 50 | 6.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1188 | 62.2%
ASCII | 723 | 37.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 372 | 51.5%
1 | 58 | 8.0%
2 | 51 | 7.1%
3 | 36 | 5.0%
5 | 35 | 4.8%
9 | 31 | 4.3%
7 | 29 | 4.0%
6 | 22 | 3.0%
4 | 22 | 3.0%
0 | 17 | 2.4%
Other values (4) | 50 | 6.9%
Hangul
ValueCountFrequency (%)
113
 
9.5%
97
 
8.2%
72
 
6.1%
58
 
4.9%
52
 
4.4%
45
 
3.8%
40
 
3.4%
37
 
3.1%
35
 
2.9%
30
 
2.5%
Other values (133) 609
51.3%

Correlations

[Figure: correlation heatmap]
Variable | 철도운영기관명 | 역명 | 지번주소 | 도로명주소
철도운영기관명 | 1.000 | 1.000 | 1.000 | 1.000
역명 | 1.000 | 1.000 | 1.000 | 1.000
지번주소 | 1.000 | 1.000 | 1.000 | 1.000
도로명주소 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

[Figure: nullity by column] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

Index | 철도운영기관명 | 선명 | 역명 | 지번주소 | 도로명주소
0 | 코레일 | 1호선 | 가능 | 경기도 의정부시 가능동 197-1 | 경기도 의정부시 평화로 633
1 | 코레일 | 1호선 | 가산디지털단지 | 서울특별시 금천구 가산동 468-4 | 서울특별시 금천구 벚꽃로 309
2 | 코레일 | 1호선 | 간석 | 인천광역시 남동구 간석4동 762번지 | 인천광역시 남동구 석정로 522-14
3 | 코레일 | 1호선 | 개봉 | 서울특별시 구로구 개보동 415 | 서울특별시 구로구 경인로40길 47
4 | 코레일 | 1호선 | 관악 | 경기도 안양시 만악구 석수동 101-16 | 경기도 안양시 만안구 경수대로1273번길 46
5 | 코레일 | 1호선 | 광명 | 경기도 광명시 일직동 276-1 | 경기도 광명시 광명역로 21
6 | 코레일 | 1호선 | 광운대 | 노원구 월계동 85 | 노원구 석계로 98-2
7 | 코레일 | 1호선 | 구로 | 서울특별시 구로구 구로동 585-3 | 서울특별시 구로구 구로중앙로 174
8 | 코레일 | 1호선 | 구일 | 서울시 구로구 구로1동 642-7 | 서울시 구로구 구일로 133
9 | 코레일 | 1호선 | 군포 | 경기도 군포시 당동 134-1 | 경기도 군포시 군포역1길 27

Index | 철도운영기관명 | 선명 | 역명 | 지번주소 | 도로명주소
89 | 서울교통공사 | 1호선 | 동대문 | 서울특별시 종로구 창신동 492-1 동대문역(1호선) | 서울특별시 종로구 종로 지하302(창신동)
90 | 서울교통공사 | 1호선 | 동묘앞 | 서울특별시 종로구 숭인동 117 동묘앞역(1호선) | 서울특별시 종로구 종로 359(숭인동)
91 | 서울교통공사 | 1호선 | 서울역 | 서울특별시 중구 남대문로5가 73-6 서울역(1호선) | 서울특별시 중구 세종대로 지하2(남대문로 5가)
92 | 서울교통공사 | 1호선 | 시청 | 서울특별시 중구 정동 5-5 시청역(1호선) | 서울특별시 중구 세종대로 지하101(정동)
93 | 서울교통공사 | 1호선 | 신설동 | 서울특별시 동대문구 신설동 76-5 신설동역(1호선) | 서울특별시 동대문구 왕산로 지하1(신설동)
94 | 서울교통공사 | 1호선 | 제기동 | 서울특별시 동대문구 제기동 65 제기동역(1호선) | 서울특별시 동대문구 왕산로 지하93(제기동)
95 | 서울교통공사 | 1호선 | 종각 | 서울특별시 종로구 종로1가 54 종각역(1호선) | 서울특별시 종로구 종로 지하55(종로1가)
96 | 서울교통공사 | 1호선 | 종로3가 | 서울특별시 종로구 종로3가 10-5 종로3가역(1호선) | 서울특별시 종로구 종로 지하129(종로3가)
97 | 서울교통공사 | 1호선 | 종로5가 | 서울특별시 종로구 종로5가 82-1 종로5가역(1호선) | 서울특별시 종로구 종로 지하216(종로5가)
98 | 서울교통공사 | 1호선 | 청량리(서울시립대입구) | 서울특별시 동대문구 전농동 620-69 청량리역(1호선) | 서울특별시 동대문구 왕산로 지하205(전농동)