Overview

Dataset statistics

Number of variables: 5
Number of observations: 301
Missing cells: 11
Missing cells (%): 0.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.9 KiB
Average record size in memory: 40.4 B
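
As a quick arithmetic check, the missing-cell percentage follows from the counts above, assuming it is defined as missing cells divided by all cells (observations times variables). The variable names below are illustrative only.

```python
# Figures taken from the dataset statistics above.
n_variables = 5
n_observations = 301
missing_cells = 11

# Assumption: the percentage is missing cells over all cells in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```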

Variable types

Categorical: 2
Text: 3

Dataset

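Description: Data on the urban and metropolitan railway stations managed by Korail (코레일): rail operator name (철도운영기관명), line name (선명), station name (역명), lot-number address (지번주소), and road-name address (도로명주소).
Author: 국가철도공단 (Korea National Railway)
URL: https://www.data.go.kr/data/15041113/fileData.do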

Alerts

철도운영기관명 has constant value "코레일" (Constant)
지번주소 has 10 (3.3%) missing values (Missing)
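
The two alerts can be read as simple per-column rules: a column is flagged Constant when it holds a single distinct non-missing value, and Missing when it contains any missing entries. The sketch below illustrates that idea only; it is an assumption, not ydata-profiling's actual implementation, and `column_alerts` and the sample columns are hypothetical.

```python
def column_alerts(values):
    """Return alert labels for one column; None marks a missing value."""
    present = [v for v in values if v is not None]
    alerts = []
    # A single distinct non-missing value means the column is constant.
    if len(set(present)) == 1:
        alerts.append("Constant")
    # Any gap between total and present values means missing entries.
    if len(values) - len(present) > 0:
        alerts.append("Missing")
    return alerts

# Illustrative columns shaped like the two flagged above.
operator = ["코레일", "코레일", "코레일"]
lot_address = ["부산광역시 해운대구 좌동 132", None, "부산광역시 동래구 안락동 201-2"]

print(column_alerts(operator))
print(column_alerts(lot_address))
```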

Reproduction

Analysis started: 2023-12-11 23:48:59.358573
Analysis finished: 2023-12-11 23:49:00.160450
Duration: 0.8 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
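
The reported duration can be checked against the two timestamps with standard-library date arithmetic. This is a small sketch; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section above.
started = datetime.fromisoformat("2023-12-11 23:48:59.358573")
finished = datetime.fromisoformat("2023-12-11 23:49:00.160450")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.1f} seconds")
```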

Variables

철도운영기관명 (rail operator name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 코레일
2nd row: 코레일
3rd row: 코레일
4th row: 코레일
5th row: 코레일

Common Values

Value | Count | Frequency (%)
코레일 | 301 | 100.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of the counts in the Common Values table

선명 (line name)
Categorical

Distinct: 8
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 4
Median length: 3
Mean length: 3.2059801
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1호선
2nd row: 1호선
3rd row: 1호선
4th row: 1호선
5th row: 1호선

Common Values

Value | Count | Frequency (%)
1호선 | 89 | 29.6%
수인분당 | 63 | 20.9%
경의중앙 | 58 | 19.3%
경춘 | 25 | 8.3%
동해 | 23 | 7.6%
4호선 | 22 | 7.3%
경강 | 11 | 3.7%
3호선 | 10 | 3.3%
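
The Frequency (%) column appears to be each count divided by the number of observations. A minimal sketch that recomputes it from the counts listed above (the dictionary name is illustrative):

```python
# Counts per line name, copied from the Common Values table above.
counts = {
    "1호선": 89, "수인분당": 63, "경의중앙": 58, "경춘": 25,
    "동해": 23, "4호선": 22, "경강": 11, "3호선": 10,
}

total = sum(counts.values())  # expected to equal the number of observations

# Share of each line as a percentage, rounded to one decimal place.
freq_pct = {line: round(100 * n / total, 1) for line, n in counts.items()}

for line, pct in freq_pct.items():
    print(f"{line}: {pct}%")
```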

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of the counts in the Common Values table

역명 (station name)
Text

Distinct: 279
Distinct (%): 92.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 14
Median length: 2
Mean length: 3.1328904
Min length: 2

Characters and Unicode

Total characters: 943
Distinct characters: 231
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
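
The reported mean lengths are consistent with total characters divided by the number of non-missing values. The sketch below checks this for the three text variables, assuming that definition; the figures are copied from this report and the names are illustrative.

```python
n_observations = 301

# (total characters, missing values) per text variable, from this report.
text_stats = {
    "역명": (943, 0),
    "지번주소": (5582, 10),
    "도로명주소": (5916, 1),
}

# Assumption: mean length = total characters / non-missing values.
mean_length = {
    name: chars / (n_observations - missing)
    for name, (chars, missing) in text_stats.items()
}

for name, value in mean_length.items():
    print(f"{name}: {value:.7f}")
```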

Unique

Unique: 259
Unique (%): 86.0%
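
Distinct and Unique measure different things: Distinct counts the different values present, while Unique counts the values that occur exactly once. Here 279 distinct names with 259 unique ones means 20 names are shared by more than one row. A minimal sketch on a hypothetical mini-sample of names (not the full column):

```python
from collections import Counter

# Hypothetical mini-sample of station names, for illustration only.
names = ["청량리", "회기", "가능", "청량리", "회기", "청량리", "개봉"]

counts = Counter(names)
distinct = len(counts)                              # different values present
unique = sum(1 for n in counts.values() if n == 1)  # values seen exactly once

print(f"distinct={distinct}, unique={unique}")
```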

Sample

1st row: 가능
2nd row: 가산디지털단지
3rd row: 간석
4th row: 개봉
5th row: 관악

Value | Count | Frequency (%)
청량리 | 3 | 1.0%
회기 | 3 | 1.0%
초지 | 2 | 0.7%
안산 | 2 | 0.7%
중앙 | 2 | 0.7%
신길온천 | 2 | 0.7%
오이도 | 2 | 0.7%
상봉 | 2 | 0.7%
고잔 | 2 | 0.7%
정왕 | 2 | 0.7%
Other values (269) | 279 | 92.7%

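Most occurring characters

Value | Count | Frequency (%)
[glyph lost] | 37 | 3.9%
) | 28 | 3.0%
( | 28 | 3.0%
[glyph lost] | 27 | 2.9%
[glyph lost] | 26 | 2.8%
[glyph lost] | 22 | 2.3%
[glyph lost] | 22 | 2.3%
[glyph lost] | 16 | 1.7%
[glyph lost] | 15 | 1.6%
[glyph lost] | 15 | 1.6%
Other values (221) | 707 | 75.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 885 | 93.8%
Close Punctuation | 28 | 3.0%
Open Punctuation | 28 | 3.0%
Other Punctuation | 2 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[glyph lost] | 37 | 4.2%
[glyph lost] | 27 | 3.1%
[glyph lost] | 26 | 2.9%
[glyph lost] | 22 | 2.5%
[glyph lost] | 22 | 2.5%
[glyph lost] | 16 | 1.8%
[glyph lost] | 15 | 1.7%
[glyph lost] | 15 | 1.7%
[glyph lost] | 14 | 1.6%
[glyph lost] | 14 | 1.6%
Other values (218) | 677 | 76.5%
Close Punctuation
Value | Count | Frequency (%)
) | 28 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 28 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
· | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 885 | 93.8%
Common | 58 | 6.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[glyph lost] | 37 | 4.2%
[glyph lost] | 27 | 3.1%
[glyph lost] | 26 | 2.9%
[glyph lost] | 22 | 2.5%
[glyph lost] | 22 | 2.5%
[glyph lost] | 16 | 1.8%
[glyph lost] | 15 | 1.7%
[glyph lost] | 15 | 1.7%
[glyph lost] | 14 | 1.6%
[glyph lost] | 14 | 1.6%
Other values (218) | 677 | 76.5%
Common
Value | Count | Frequency (%)
) | 28 | 48.3%
( | 28 | 48.3%
· | 2 | 3.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 885 | 93.8%
ASCII | 56 | 5.9%
None | 2 | 0.2%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[glyph lost] | 37 | 4.2%
[glyph lost] | 27 | 3.1%
[glyph lost] | 26 | 2.9%
[glyph lost] | 22 | 2.5%
[glyph lost] | 22 | 2.5%
[glyph lost] | 16 | 1.8%
[glyph lost] | 15 | 1.7%
[glyph lost] | 15 | 1.7%
[glyph lost] | 14 | 1.6%
[glyph lost] | 14 | 1.6%
Other values (218) | 677 | 76.5%
ASCII
Value | Count | Frequency (%)
) | 28 | 50.0%
( | 28 | 50.0%
None
Value | Count | Frequency (%)
· | 2 | 100.0%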

지번주소 (lot-number address)
Text

MISSING

Distinct: 275
Distinct (%): 94.5%
Missing: 10
Missing (%): 3.3%
Memory size: 2.5 KiB

Length

Max length: 29
Median length: 26
Mean length: 19.182131
Min length: 7

Characters and Unicode

Total characters: 5582
Distinct characters: 209
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
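
To illustrate how such character properties are read, the sketch below tallies the Unicode general category of each character in the first sample address of this variable, using Python's standard `unicodedata` module. It covers categories only; the script and block breakdowns in this report come from the profiling tool and are not reproduced here. The `REPORT_NAMES` mapping is an illustrative helper.

```python
import unicodedata
from collections import Counter

# First sample row of 지번주소, as listed in this report.
address = "경기도 의정부시 가능동 197-1"

# Two-letter general category codes and the names this report uses for them.
REPORT_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
}

# Count how many characters fall into each general category.
categories = Counter(unicodedata.category(ch) for ch in address)

for code, n in categories.most_common():
    print(f"{REPORT_NAMES.get(code, code)}: {n}")
```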

Unique

Unique: 260
Unique (%): 89.3%

Sample

1st row: 경기도 의정부시 가능동 197-1
2nd row: 서울특별시 금천구 가산동 468-4
3rd row: 인천광역시 남동구 간석4동 762번지
4th row: 서울특별시 구로구 개보동 415
5th row: 경기도 안양시 만악구 석수동 101-16

Value | Count | Frequency (%)
경기도 | 159 | 12.5%
서울특별시 | 54 | 4.3%
고양시 | 20 | 1.6%
인천광역시 | 16 | 1.3%
부산광역시 | 15 | 1.2%
안산시 | 15 | 1.2%
남양주시 | 13 | 1.0%
수원시 | 13 | 1.0%
성남시 | 12 | 0.9%
강남구 | 10 | 0.8%
Other values (608) | 941 | 74.2%
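
The counts in this table look like frequencies of whitespace-separated tokens rather than of whole addresses (경기도 alone occurs 159 times). Assuming that tokenisation, which the report itself does not state, a minimal sketch over the five sample rows would be:

```python
from collections import Counter

# The five sample rows of 지번주소 shown above.
rows = [
    "경기도 의정부시 가능동 197-1",
    "서울특별시 금천구 가산동 468-4",
    "인천광역시 남동구 간석4동 762번지",
    "서울특별시 구로구 개보동 415",
    "경기도 안양시 만악구 석수동 101-16",
]

# Assumption: tokens are obtained by splitting each address on whitespace.
tokens = Counter(tok for row in rows for tok in row.split())
n_tokens = sum(tokens.values())

for tok, n in tokens.most_common(3):
    print(f"{tok}: {n} ({100 * n / n_tokens:.1f}%)")
```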

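Most occurring characters

Value | Count | Frequency (%)
(space) | 984 | 17.6%
[glyph lost] | 276 | 4.9%
[glyph lost] | 275 | 4.9%
1 | 246 | 4.4%
- | 207 | 3.7%
[glyph lost] | 190 | 3.4%
[glyph lost] | 187 | 3.4%
2 | 175 | 3.1%
[glyph lost] | 170 | 3.0%
[glyph lost] | 163 | 2.9%
Other values (199) | 2709 | 48.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3238 | 58.0%
Decimal Number | 1149 | 20.6%
Space Separator | 984 | 17.6%
Dash Punctuation | 207 | 3.7%
Open Punctuation | 2 | < 0.1%
Close Punctuation | 2 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[glyph lost] | 276 | 8.5%
[glyph lost] | 275 | 8.5%
[glyph lost] | 190 | 5.9%
[glyph lost] | 187 | 5.8%
[glyph lost] | 170 | 5.3%
[glyph lost] | 163 | 5.0%
[glyph lost] | 86 | 2.7%
[glyph lost] | 77 | 2.4%
[glyph lost] | 75 | 2.3%
[glyph lost] | 70 | 2.2%
Other values (185) | 1669 | 51.5%
Decimal Number
Value | Count | Frequency (%)
1 | 246 | 21.4%
2 | 175 | 15.2%
3 | 118 | 10.3%
6 | 108 | 9.4%
5 | 107 | 9.3%
4 | 96 | 8.4%
7 | 87 | 7.6%
8 | 86 | 7.5%
0 | 64 | 5.6%
9 | 62 | 5.4%
Space Separator
Value | Count | Frequency (%)
(space) | 984 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 207 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 2 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3238 | 58.0%
Common | 2344 | 42.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[glyph lost] | 276 | 8.5%
[glyph lost] | 275 | 8.5%
[glyph lost] | 190 | 5.9%
[glyph lost] | 187 | 5.8%
[glyph lost] | 170 | 5.3%
[glyph lost] | 163 | 5.0%
[glyph lost] | 86 | 2.7%
[glyph lost] | 77 | 2.4%
[glyph lost] | 75 | 2.3%
[glyph lost] | 70 | 2.2%
Other values (185) | 1669 | 51.5%
Common
Value | Count | Frequency (%)
(space) | 984 | 42.0%
1 | 246 | 10.5%
- | 207 | 8.8%
2 | 175 | 7.5%
3 | 118 | 5.0%
6 | 108 | 4.6%
5 | 107 | 4.6%
4 | 96 | 4.1%
7 | 87 | 3.7%
8 | 86 | 3.7%
Other values (4) | 130 | 5.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3238 | 58.0%
ASCII | 2344 | 42.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 984 | 42.0%
1 | 246 | 10.5%
- | 207 | 8.8%
2 | 175 | 7.5%
3 | 118 | 5.0%
6 | 108 | 4.6%
5 | 107 | 4.6%
4 | 96 | 4.1%
7 | 87 | 3.7%
8 | 86 | 3.7%
Other values (4) | 130 | 5.5%
Hangul
Value | Count | Frequency (%)
[glyph lost] | 276 | 8.5%
[glyph lost] | 275 | 8.5%
[glyph lost] | 190 | 5.9%
[glyph lost] | 187 | 5.8%
[glyph lost] | 170 | 5.3%
[glyph lost] | 163 | 5.0%
[glyph lost] | 86 | 2.7%
[glyph lost] | 77 | 2.4%
[glyph lost] | 75 | 2.3%
[glyph lost] | 70 | 2.2%
Other values (185) | 1669 | 51.5%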

도로명주소 (road-name address)
Text

Distinct: 288
Distinct (%): 96.0%
Missing: 1
Missing (%): 0.3%
Memory size: 2.5 KiB

Length

Max length: 36
Median length: 31.5
Mean length: 19.72
Min length: 12

Characters and Unicode

Total characters: 5916
Distinct characters: 230
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 277
Unique (%): 92.3%

Sample

1st row: 경기도 의정부시 평화로 633
2nd row: 서울특별시 금천구 벚꽃로 309
3rd row: 인천광역시 남동구 석정로 522-14
4th row: 서울특별시 구로구 경인로40길 47
5th row: 경기도 안양시 만안구 경수대로1273번길 46

Value | Count | Frequency (%)
경기도 | 164 | 11.9%
서울특별시 | 54 | 3.9%
고양시 | 18 | 1.3%
부산광역시 | 17 | 1.2%
인천광역시 | 16 | 1.2%
안산시 | 15 | 1.1%
서울시 | 13 | 0.9%
지하 | 13 | 0.9%
수원시 | 13 | 0.9%
남양주시 | 13 | 0.9%
Other values (605) | 1037 | 75.5%

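Most occurring characters

Value | Count | Frequency (%)
(space) | 1145 | 19.4%
[glyph lost] | 291 | 4.9%
[glyph lost] | 287 | 4.9%
[glyph lost] | 198 | 3.3%
[glyph lost] | 191 | 3.2%
[glyph lost] | 189 | 3.2%
1 | 188 | 3.2%
[glyph lost] | 183 | 3.1%
2 | 129 | 2.2%
3 | 106 | 1.8%
Other values (220) | 3009 | 50.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3730 | 63.0%
Space Separator | 1145 | 19.4%
Decimal Number | 956 | 16.2%
Dash Punctuation | 30 | 0.5%
Close Punctuation | 26 | 0.4%
Open Punctuation | 26 | 0.4%
Other Punctuation | 3 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[glyph lost] | 291 | 7.8%
[glyph lost] | 287 | 7.7%
[glyph lost] | 198 | 5.3%
[glyph lost] | 191 | 5.1%
[glyph lost] | 189 | 5.1%
[glyph lost] | 183 | 4.9%
[glyph lost] | 91 | 2.4%
[glyph lost] | 83 | 2.2%
[glyph lost] | 82 | 2.2%
[glyph lost] | 79 | 2.1%
Other values (205) | 2056 | 55.1%
Decimal Number
Value | Count | Frequency (%)
1 | 188 | 19.7%
2 | 129 | 13.5%
3 | 106 | 11.1%
5 | 91 | 9.5%
0 | 85 | 8.9%
4 | 81 | 8.5%
7 | 81 | 8.5%
6 | 68 | 7.1%
9 | 67 | 7.0%
8 | 60 | 6.3%
Space Separator
Value | Count | Frequency (%)
(space) | 1145 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 30 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 26 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 26 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
, | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3730 | 63.0%
Common | 2186 | 37.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[glyph lost] | 291 | 7.8%
[glyph lost] | 287 | 7.7%
[glyph lost] | 198 | 5.3%
[glyph lost] | 191 | 5.1%
[glyph lost] | 189 | 5.1%
[glyph lost] | 183 | 4.9%
[glyph lost] | 91 | 2.4%
[glyph lost] | 83 | 2.2%
[glyph lost] | 82 | 2.2%
[glyph lost] | 79 | 2.1%
Other values (205) | 2056 | 55.1%
Common
Value | Count | Frequency (%)
(space) | 1145 | 52.4%
1 | 188 | 8.6%
2 | 129 | 5.9%
3 | 106 | 4.8%
5 | 91 | 4.2%
0 | 85 | 3.9%
4 | 81 | 3.7%
7 | 81 | 3.7%
6 | 68 | 3.1%
9 | 67 | 3.1%
Other values (5) | 145 | 6.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3730 | 63.0%
ASCII | 2186 | 37.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1145 | 52.4%
1 | 188 | 8.6%
2 | 129 | 5.9%
3 | 106 | 4.8%
5 | 91 | 4.2%
0 | 85 | 3.9%
4 | 81 | 3.7%
7 | 81 | 3.7%
6 | 68 | 3.1%
9 | 67 | 3.1%
Other values (5) | 145 | 6.6%
Hangul
Value | Count | Frequency (%)
[glyph lost] | 291 | 7.8%
[glyph lost] | 287 | 7.7%
[glyph lost] | 198 | 5.3%
[glyph lost] | 191 | 5.1%
[glyph lost] | 189 | 5.1%
[glyph lost] | 183 | 4.9%
[glyph lost] | 91 | 2.4%
[glyph lost] | 83 | 2.2%
[glyph lost] | 82 | 2.2%
[glyph lost] | 79 | 2.1%
Other values (205) | 2056 | 55.1%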

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
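
Nullity by column is simply a per-column count of missing entries. The sketch below computes it for rows 296 to 300 of the sample table in this report, with `None` standing in for `<NA>`. It is an illustration in plain Python, not the code the profiling tool runs, and the variable names are illustrative.

```python
columns = ["철도운영기관명", "선명", "역명", "지번주소", "도로명주소"]

# Rows 296-300 of the sample table; None stands in for <NA>.
rows = [
    ("코레일", "동해", "월내", None, "부산광역시 기장군 장안읍 해맞이로 351-31"),
    ("코레일", "동해", "일광", "부산광역시 기장군 삼성리 21-1", "부산광역시 기장군 일광읍 일광로 111-10"),
    ("코레일", "동해", "재송", "부산광역시 해운대구 재송동 909-2", "부산광역시 해운대구 해운대로 100"),
    ("코레일", "동해", "좌천", None, "부산광역시 기장군 장안읍 좌천리 239-4"),
    ("코레일", "동해", "태화강", None, "울산광역시 남구 산업로 654"),
]

# For each column, count the rows that have no value in it.
missing_by_column = {
    col: sum(row[i] is None for row in rows) for i, col in enumerate(columns)
}

for col, n in missing_by_column.items():
    print(f"{col}: {n} missing of {len(rows)}")
```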

Sample

Index | 철도운영기관명 | 선명 | 역명 | 지번주소 | 도로명주소
0 | 코레일 | 1호선 | 가능 | 경기도 의정부시 가능동 197-1 | 경기도 의정부시 평화로 633
1 | 코레일 | 1호선 | 가산디지털단지 | 서울특별시 금천구 가산동 468-4 | 서울특별시 금천구 벚꽃로 309
2 | 코레일 | 1호선 | 간석 | 인천광역시 남동구 간석4동 762번지 | 인천광역시 남동구 석정로 522-14
3 | 코레일 | 1호선 | 개봉 | 서울특별시 구로구 개보동 415 | 서울특별시 구로구 경인로40길 47
4 | 코레일 | 1호선 | 관악 | 경기도 안양시 만악구 석수동 101-16 | 경기도 안양시 만안구 경수대로1273번길 46
5 | 코레일 | 1호선 | 광명 | 경기도 광명시 일직동 276-1 | 경기도 광명시 광명역로 21
6 | 코레일 | 1호선 | 광운대 | 노원구 월계동 85 | 노원구 석계로 98-2
7 | 코레일 | 1호선 | 구로 | 서울특별시 구로구 구로동 585-3 | 서울특별시 구로구 구로중앙로 174
8 | 코레일 | 1호선 | 구일 | 서울시 구로구 구로1동 642-7 | 서울시 구로구 구일로 133
9 | 코레일 | 1호선 | 군포 | 경기도 군포시 당동 134-1 | 경기도 군포시 군포역1길 27

Index | 철도운영기관명 | 선명 | 역명 | 지번주소 | 도로명주소
291 | 코레일 | 동해 | 센텀 | 부산광역시 해운대구 재송동 646-2 | 부산광역시 해운대구 해운대로 210
292 | 코레일 | 동해 | 송정 | 부산광역시 해운대구 송정동 120-1 | 부산광역시 해운대구 해운대로 1147
293 | 코레일 | 동해 | 신해운대 | 부산광역시 해운대구 좌동 132 | 부산광역시 해운대구 장산로 427
294 | 코레일 | 동해 | 안락 | 부산광역시 동래구 안락동 201-2 | 부산광역시 동래구 안연로98번길 57
295 | 코레일 | 동해 | 오시리아 | 부산광역시 기장군 기장읍 당사리 261-8 | 부산광역시 기장군 기장읍 동부산관광5로 14
296 | 코레일 | 동해 | 월내 | <NA> | 부산광역시 기장군 장안읍 해맞이로 351-31
297 | 코레일 | 동해 | 일광 | 부산광역시 기장군 삼성리 21-1 | 부산광역시 기장군 일광읍 일광로 111-10
298 | 코레일 | 동해 | 재송 | 부산광역시 해운대구 재송동 909-2 | 부산광역시 해운대구 해운대로 100
299 | 코레일 | 동해 | 좌천 | <NA> | 부산광역시 기장군 장안읍 좌천리 239-4
300 | 코레일 | 동해 | 태화강 | <NA> | 울산광역시 남구 산업로 654