Overview

Dataset statistics

Number of variables: 5
Number of observations: 58
Missing cells: 3
Missing cells (%): 1.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.4 KiB
Average record size in memory: 42.3 B
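
The missing-cell percentage can be recomputed from the raw counts. The following is a minimal sketch, assuming the percentage is taken over all cells (variables × observations); the variable names are illustrative and not part of the report.

```python
# Raw counts as reported in the dataset statistics.
n_variables = 5
n_observations = 58
missing_cells = 3

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```

Under that assumption, 3 missing cells out of 290 round to the reported 1.0%.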

Variable types

Categorical: 2
Text: 3

Dataset

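Description: Contains the railway operator name (철도운영기관명), line name (선명), station name (역명), lot-number address (지번주소), and road-name address (도로명주소) of the metropolitan railway stations on the Gyeongui-Jungang Line (경의중앙선).
Author: 국가철도공단 (Korea National Railway)
URL: https://www.data.go.kr/data/15041117/fileData.do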

Alerts

철도운영기관명 has constant value "코레일" (Constant)
선명 has constant value "경의중앙" (Constant)
지번주소 has 2 (3.4%) missing values (Missing)
도로명주소 has 1 (1.7%) missing values (Missing)
역명 has unique values (Unique)
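
The alerts above can be approximated with a few counts per column. This is a minimal sketch, not ydata-profiling's implementation: the function name `column_alerts` is hypothetical, `None` stands for a missing cell, and "Unique" is assumed to require that every row holds a distinct, non-missing value (which matches the alerts listed for this dataset).

```python
from typing import Optional, Sequence


def column_alerts(values: Sequence[Optional[str]]) -> list[str]:
    """Return alert labels for one column; None marks a missing cell."""
    alerts = []
    present = [v for v in values if v is not None]
    n_missing = len(values) - len(present)
    n_distinct = len(set(present))

    if n_distinct == 1:
        alerts.append("Constant")
    if n_missing > 0:
        alerts.append("Missing")
    # Assumption: unique means no missing cells and no repeated values.
    if n_missing == 0 and present and n_distinct == len(values):
        alerts.append("Unique")
    return alerts


print(column_alerts(["코레일"] * 58))
print(column_alerts(["가좌", "강매", "곡산"]))
print(column_alerts(["a", None, "b"]))
```

A single-row column would be flagged both Constant and Unique under these rules; a real profiler may resolve that case differently.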

Reproduction

Analysis started: 2023-12-12 14:07:59.706629
Analysis finished: 2023-12-12 14:08:00.377260
Duration: 0.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
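
A report of this kind can be regenerated with ydata-profiling. The sketch below assumes the dataset has been downloaded as a CSV file; the function name `build_report`, the file paths, the report title, and the `cp949` encoding are all assumptions, since the original run's configuration is only available in config.json.

```python
def build_report(csv_path: str, html_path: str, encoding: str = "cp949") -> None:
    """Profile a CSV file and write the result as an HTML report."""
    # Imported lazily so this module loads even without the third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is CP949-encoded; pass encoding="utf-8" if it is not.
    df = pd.read_csv(csv_path, encoding=encoding)

    # The title is illustrative; the original report's title is not known.
    report = ProfileReport(df, title="Gyeongui-Jungang Line stations")
    report.to_file(html_path)
```

A call such as `build_report("stations.csv", "report.html")` (hypothetical file names) would write the report next to the script.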

Variables

철도운영기관명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 596.0 B
코레일: 58

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 코레일
2nd row: 코레일
3rd row: 코레일
4th row: 코레일
5th row: 코레일

Common Values

Value | Count | Frequency (%)
코레일 | 58 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
코레일 | 58 | 100.0%

선명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 596.0 B
경의중앙: 58

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경의중앙
2nd row: 경의중앙
3rd row: 경의중앙
4th row: 경의중앙
5th row: 경의중앙

Common Values

Value | Count | Frequency (%)
경의중앙 | 58 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
경의중앙 | 58 | 100.0%

역명
Text

UNIQUE 

Distinct: 58
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 596.0 B

Length

Max length: 12
Median length: 2
Mean length: 3.0862069
Min length: 2

Characters and Unicode

Total characters: 179
Distinct characters: 103
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
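
As an illustration of those character properties, the following sketch uses Python's standard `unicodedata` module to count general categories in one station name from this dataset. It is not the profiler's own code; the standard library exposes no script or block property, so the character name is printed only as a rough hint of the script.

```python
import unicodedata
from collections import Counter

name = "파주(두원대학)"  # a station name from the sample rows

# General category of each code point:
# Lo = Other Letter, Ps = Open Punctuation, Pe = Close Punctuation.
categories = Counter(unicodedata.category(ch) for ch in name)
print(categories)

# For Hangul syllables the character name starts with "HANGUL SYLLABLE".
first_name = unicodedata.name(name[0])
print(first_name)
```

The three categories found here are the same three reported for this variable: Other Letter, Open Punctuation, and Close Punctuation.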

Unique

Unique: 58
Unique (%): 100.0%

Sample

1st row: 가좌
2nd row: 강매
3rd row: 곡산
4th row: 공덕
5th row: 구리

Value | Count | Frequency (%)
가좌 | 1 | 1.7%
이촌 | 1 | 1.7%
회기 | 1 | 1.7%
오빈 | 1 | 1.7%
옥수 | 1 | 1.7%
왕십리 | 1 | 1.7%
용문 | 1 | 1.7%
용산 | 1 | 1.7%
운길산 | 1 | 1.7%
운정 | 1 | 1.7%
Other values (48) | 48 | 82.8%

Most occurring characters

ValueCountFrequency (%)
8
 
4.5%
7
 
3.9%
) 6
 
3.4%
( 6
 
3.4%
6
 
3.4%
5
 
2.8%
5
 
2.8%
4
 
2.2%
4
 
2.2%
3
 
1.7%
Other values (93) 125
69.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 167
93.3%
Close Punctuation 6
 
3.4%
Open Punctuation 6
 
3.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
8
 
4.8%
7
 
4.2%
6
 
3.6%
5
 
3.0%
5
 
3.0%
4
 
2.4%
4
 
2.4%
3
 
1.8%
3
 
1.8%
3
 
1.8%
Other values (91) 119
71.3%
Close Punctuation
ValueCountFrequency (%)
) 6
100.0%
Open Punctuation
ValueCountFrequency (%)
( 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 167
93.3%
Common 12
 
6.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
8
 
4.8%
7
 
4.2%
6
 
3.6%
5
 
3.0%
5
 
3.0%
4
 
2.4%
4
 
2.4%
3
 
1.8%
3
 
1.8%
3
 
1.8%
Other values (91) 119
71.3%
Common
ValueCountFrequency (%)
) 6
50.0%
( 6
50.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 167
93.3%
ASCII 12
 
6.7%

Most frequent character per block

Hangul
ValueCountFrequency (%)
8
 
4.8%
7
 
4.2%
6
 
3.6%
5
 
3.0%
5
 
3.0%
4
 
2.4%
4
 
2.4%
3
 
1.8%
3
 
1.8%
3
 
1.8%
Other values (91) 119
71.3%
ASCII
ValueCountFrequency (%)
) 6
50.0%
( 6
50.0%

지번주소
Text

MISSING 

Distinct: 56
Distinct (%): 100.0%
Missing: 2
Missing (%): 3.4%
Memory size: 596.0 B

Length

Max length: 24
Median length: 21.5
Mean length: 19.678571
Min length: 11

Characters and Unicode

Total characters: 1102
Distinct characters: 108
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 56
Unique (%): 100.0%

Sample

1st row: 서울특별시 서대문구 남가좌동 296-12
2nd row: 경기도 고양시 덕양구 행신동 1115-1
3rd row: 경기도 고양시 일산동구 백석동 1185-1
4th row: 서울특별시 마포구 도화동 25-13
5th row: 경기도 구리시 인창동 244-1

Value | Count | Frequency (%)
경기도 | 32 | 13.0%
서울특별시 | 19 | 7.7%
고양시 | 10 | 4.0%
양평군 | 9 | 3.6%
파주시 | 8 | 3.2%
남양주시 | 6 | 2.4%
용산구 | 5 | 2.0%
중랑구 | 4 | 1.6%
덕양구 | 4 | 1.6%
양평읍 | 3 | 1.2%
Other values (128) | 147 | 59.5%
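
The value counts for this variable appear to be word frequencies rather than whole-address frequencies. A minimal sketch of that kind of count, assuming whitespace tokenization (an assumption, not confirmed by the report) and using only the five sample rows shown for 지번주소:

```python
from collections import Counter

# The five sample 지번주소 values listed in this section.
addresses = [
    "서울특별시 서대문구 남가좌동 296-12",
    "경기도 고양시 덕양구 행신동 1115-1",
    "경기도 고양시 일산동구 백석동 1185-1",
    "서울특별시 마포구 도화동 25-13",
    "경기도 구리시 인창동 244-1",
]

# Split every address on whitespace and count each token.
tokens = [tok for addr in addresses for tok in addr.split()]
counts = Counter(tokens)

for value, count in counts.most_common(3):
    print(value, count, f"{100 * count / len(tokens):.1f}%")
```

Because only five rows are used, the counts differ from the full-dataset table; the point is the mechanism, in which province and city names dominate because they repeat across rows.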

Most occurring characters

ValueCountFrequency (%)
194
 
17.6%
1 49
 
4.4%
46
 
4.2%
45
 
4.1%
- 42
 
3.8%
36
 
3.3%
2 36
 
3.3%
35
 
3.2%
34
 
3.1%
33
 
3.0%
Other values (98) 552
50.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 634
57.5%
Decimal Number 232
 
21.1%
Space Separator 194
 
17.6%
Dash Punctuation 42
 
3.8%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
46
 
7.3%
45
 
7.1%
36
 
5.7%
35
 
5.5%
34
 
5.4%
33
 
5.2%
30
 
4.7%
28
 
4.4%
22
 
3.5%
19
 
3.0%
Other values (86) 306
48.3%
Decimal Number
ValueCountFrequency (%)
1 49
21.1%
2 36
15.5%
3 26
11.2%
5 22
9.5%
8 21
9.1%
4 19
 
8.2%
9 16
 
6.9%
0 16
 
6.9%
6 14
 
6.0%
7 13
 
5.6%
Space Separator
ValueCountFrequency (%)
194
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 42
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 634
57.5%
Common 468
42.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
46
 
7.3%
45
 
7.1%
36
 
5.7%
35
 
5.5%
34
 
5.4%
33
 
5.2%
30
 
4.7%
28
 
4.4%
22
 
3.5%
19
 
3.0%
Other values (86) 306
48.3%
Common
ValueCountFrequency (%)
194
41.5%
1 49
 
10.5%
- 42
 
9.0%
2 36
 
7.7%
3 26
 
5.6%
5 22
 
4.7%
8 21
 
4.5%
4 19
 
4.1%
9 16
 
3.4%
0 16
 
3.4%
Other values (2) 27
 
5.8%

Most occurring blocks

ValueCountFrequency (%)
Hangul 634
57.5%
ASCII 468
42.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
194
41.5%
1 49
 
10.5%
- 42
 
9.0%
2 36
 
7.7%
3 26
 
5.6%
5 22
 
4.7%
8 21
 
4.5%
4 19
 
4.1%
9 16
 
3.4%
0 16
 
3.4%
Other values (2) 27
 
5.8%
Hangul
ValueCountFrequency (%)
46
 
7.3%
45
 
7.1%
36
 
5.7%
35
 
5.5%
34
 
5.4%
33
 
5.2%
30
 
4.7%
28
 
4.4%
22
 
3.5%
19
 
3.0%
Other values (86) 306
48.3%

도로명주소
Text

MISSING 

Distinct: 57
Distinct (%): 100.0%
Missing: 1
Missing (%): 1.7%
Memory size: 596.0 B

Length

Max length: 29
Median length: 23
Mean length: 19.175439
Min length: 12

Characters and Unicode

Total characters: 1093
Distinct characters: 120
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 57
Unique (%): 100.0%

Sample

1st row: 서울시 서대문구 수색로 27
2nd row: 경기도 고양시 덕양구 소원로202 (행신동,강매역사)
3rd row: 경기도 고양시 일산동구 경의로 160
4th row: 서울특별시 마포구 마포대로 92
5th row: 경기도 구리시 건원대로 34번길 32-29

Value | Count | Frequency (%)
경기도 | 34 | 13.2%
서울특별시 | 18 | 7.0%
양평군 | 9 | 3.5%
고양시 | 9 | 3.5%
파주시 | 9 | 3.5%
남양주시 | 6 | 2.3%
용산구 | 5 | 1.9%
경의로 | 5 | 1.9%
중랑구 | 4 | 1.6%
서울시 | 4 | 1.6%
Other values (128) | 154 | 59.9%

Most occurring characters

ValueCountFrequency (%)
214
 
19.6%
48
 
4.4%
46
 
4.2%
41
 
3.8%
36
 
3.3%
35
 
3.2%
1 34
 
3.1%
34
 
3.1%
32
 
2.9%
32
 
2.9%
Other values (110) 541
49.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 699
64.0%
Space Separator 214
 
19.6%
Decimal Number 167
 
15.3%
Dash Punctuation 6
 
0.5%
Close Punctuation 3
 
0.3%
Open Punctuation 3
 
0.3%
Other Punctuation 1
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
48
 
6.9%
46
 
6.6%
41
 
5.9%
36
 
5.2%
35
 
5.0%
34
 
4.9%
32
 
4.6%
32
 
4.6%
22
 
3.1%
21
 
3.0%
Other values (95) 352
50.4%
Decimal Number
ValueCountFrequency (%)
1 34
20.4%
2 25
15.0%
3 23
13.8%
8 15
9.0%
0 15
9.0%
7 14
8.4%
5 14
8.4%
4 10
 
6.0%
6 9
 
5.4%
9 8
 
4.8%
Space Separator
ValueCountFrequency (%)
214
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6
100.0%
Close Punctuation
ValueCountFrequency (%)
) 3
100.0%
Open Punctuation
ValueCountFrequency (%)
( 3
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 699
64.0%
Common 394
36.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
48
 
6.9%
46
 
6.6%
41
 
5.9%
36
 
5.2%
35
 
5.0%
34
 
4.9%
32
 
4.6%
32
 
4.6%
22
 
3.1%
21
 
3.0%
Other values (95) 352
50.4%
Common
ValueCountFrequency (%)
214
54.3%
1 34
 
8.6%
2 25
 
6.3%
3 23
 
5.8%
8 15
 
3.8%
0 15
 
3.8%
7 14
 
3.6%
5 14
 
3.6%
4 10
 
2.5%
6 9
 
2.3%
Other values (5) 21
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 699
64.0%
ASCII 394
36.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
214
54.3%
1 34
 
8.6%
2 25
 
6.3%
3 23
 
5.8%
8 15
 
3.8%
0 15
 
3.8%
7 14
 
3.6%
5 14
 
3.6%
4 10
 
2.5%
6 9
 
2.3%
Other values (5) 21
 
5.3%
Hangul
ValueCountFrequency (%)
48
 
6.9%
46
 
6.6%
41
 
5.9%
36
 
5.2%
35
 
5.0%
34
 
4.9%
32
 
4.6%
32
 
4.6%
22
 
3.1%
21
 
3.0%
Other values (95) 352
50.4%

Correlations

Variable | 역명 | 지번주소 | 도로명주소
역명 | 1.000 | 1.000 | 1.000
지번주소 | 1.000 | 1.000 | 1.000
도로명주소 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
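
Nullity correlation can be read as the Pearson correlation between two columns' present/missing indicators. The sketch below assumes that definition; the function name `nullity_correlation` and the toy columns are hypothetical and do not come from this dataset.

```python
from math import sqrt
from typing import Optional, Sequence


def nullity_correlation(a: Sequence[Optional[str]], b: Sequence[Optional[str]]) -> float:
    """Pearson correlation between the presence indicators of two columns."""
    # 1.0 where a value is present, 0.0 where it is missing (None).
    x = [0.0 if v is None else 1.0 for v in a]
    y = [0.0 if v is None else 1.0 for v in b]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n

    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    if var_x == 0 or var_y == 0:
        # A fully present (or fully missing) column has no nullity variance.
        raise ValueError("nullity correlation is undefined for this column")
    return cov / sqrt(var_x * var_y)


# Toy columns: the second is missing only where the first is also missing.
lot = ["a", None, "c", None]
road = ["x", None, "z", "w"]
print(nullity_correlation(lot, road))
```

A value near 1 means the two columns tend to be missing in the same rows, and a value near -1 means one tends to be present where the other is missing.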

Sample

Row | 철도운영기관명 | 선명 | 역명 | 지번주소 | 도로명주소
0 | 코레일 | 경의중앙 | 가좌 | 서울특별시 서대문구 남가좌동 296-12 | 서울시 서대문구 수색로 27
1 | 코레일 | 경의중앙 | 강매 | 경기도 고양시 덕양구 행신동 1115-1 | 경기도 고양시 덕양구 소원로202 (행신동,강매역사)
2 | 코레일 | 경의중앙 | 곡산 | 경기도 고양시 일산동구 백석동 1185-1 | 경기도 고양시 일산동구 경의로 160
3 | 코레일 | 경의중앙 | 공덕 | 서울특별시 마포구 도화동 25-13 | 서울특별시 마포구 마포대로 92
4 | 코레일 | 경의중앙 | 구리 | 경기도 구리시 인창동 244-1 | 경기도 구리시 건원대로 34번길 32-29
5 | 코레일 | 경의중앙 | 국수 | 경기도 양평군 양서면 국수리 258-25 | 경기도 양평군 양서면 국수역길 45
6 | 코레일 | 경의중앙 | 금릉 | 경기도 파주시 금촌동 605-1 | 경기도 파주시 금릉역로 85
7 | 코레일 | 경의중앙 | 금촌 | 경기도 파주시 금촌동 329-355 | 경기도 파주시 새꽃로 193번지
8 | 코레일 | 경의중앙 | 능곡 | 경기도 고양시 덕양구 토당동 454-3 | 경기도 고양시 덕양구 토당로 35
9 | 코레일 | 경의중앙 | 대곡 | 경기도 고양시 대장동 426-3 | 경기도 교양시 대주로 107번길 71-82

Row | 철도운영기관명 | 선명 | 역명 | 지번주소 | 도로명주소
48 | 코레일 | 경의중앙 | 탄현 | 경기도 고양시 일산서구 덕이동 238-12 | 경기도 고양시 일산서구 경의로 856
49 | 코레일 | 경의중앙 | 파주(두원대학) | 경기도 파주시 파주읍 봉암리487-4 | 경기도 파주시 파주읍 주라위길 38
50 | 코레일 | 경의중앙 | 팔당 | 경기도 남양주시 와부읍 팔당리 360 | 경기도 남양주시 와부읍 팔당로 107
51 | 코레일 | 경의중앙 | 풍산 | 경기도 고양시 일산동구 풍동 1042 | 경기도 고양시 일산동구 경의로 486
52 | 코레일 | 경의중앙 | 한남 | 서울시 용산구 한남동 | 서울특별시 용산구 독서당로6길 12-13
53 | 코레일 | 경의중앙 | 행신 | 경기도 고양시 덕양구 행신동 812 | 경기도 고양시 덕양구 소원로 102
54 | 코레일 | 경의중앙 | 홍대입구 | 서울특별시 마포구 동교동 190-1 | 서울특별시 마포구 양화로 지하188
55 | 코레일 | 경의중앙 | 화전(한국항공대) | 경기도 고양시 덕양구 화전동 183-10 | 경기도 고양시 덕양구 화랑로 53
56 | 코레일 | 경의중앙 | 회기 | 서울특별시 동대문구 휘경동 317-102 | 서울특별시 동대문구 회기로 196
57 | 코레일 | 경의중앙 | 효창공원앞 | 서울특별시 용산구 효창동 80 | 서울특별시 용산구 원효로 71길 40