Overview

Dataset statistics

Number of variables: 4
Number of observations: 59
Missing cells: 9
Missing cells (%): 3.8%
Duplicate rows: 1
Duplicate rows (%): 1.7%
Total size in memory: 2.0 KiB
Average record size in memory: 34.2 B
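
The two percentages follow directly from the raw counts. A minimal sketch of the arithmetic, using only figures quoted in this report (the variable names are illustrative):

```python
# Figures quoted in the "Dataset statistics" block of this report.
n_variables = 4
n_observations = 59
missing_cells = 9
duplicate_rows = 1

total_cells = n_variables * n_observations           # 236 cells in total
missing_pct = 100 * missing_cells / total_cells      # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")      # 3.8%
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")   # 1.7%
```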

Variable types

Categorical: 1
Text: 3

Dataset

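Description: Station telephone numbers and addresses by line for Incheon Transit Corporation (인천교통공사), as of December 31, 2017. Columns: 구분 (category, i.e. the line), 역명 (station name), 전화번호 (telephone number), 주소 (address).
URL: https://www.data.go.kr/data/15043813/fileData.do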

Alerts

Dataset has 1 (1.7%) duplicate rows [Duplicates]
역명 has 3 (5.1%) missing values [Missing]
전화번호 has 3 (5.1%) missing values [Missing]
주 소 has 3 (5.1%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 14:35:19.609691
Analysis finished: 2023-12-12 14:35:20.198210
Duration: 0.59 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구분
Categorical

Distinct: 3
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Memory size: 604.0 B

1호선: 29
2호선: 27

Length

Max length: 3
Median length: 3
Mean length: 2.8983051
Min length: 1
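
These length statistics can be checked against the value counts reported for this variable (29, 27, and 3). The sketch below assumes the third distinct value is a single whitespace character, which the report renders as blank; that is an inference from the minimum length of 1, not something the report states.

```python
from statistics import mean, median

# Value counts reported for 구분. The third value is ASSUMED to be a single
# whitespace character; the report only shows it as blank.
counts = {"1호선": 29, "2호선": 27, " ": 3}

# One length per observation.
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

print(len(lengths))                # 59 observations
print(max(lengths), min(lengths))  # 3 1
print(median(lengths))             # 3
print(round(mean(lengths), 7))     # 2.8983051
```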

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1호선
2nd row: 1호선
3rd row: 1호선
4th row: 1호선
5th row: 1호선

Common Values

Value     Count  Frequency (%)
1호선     29     49.2%
2호선     27     45.8%
(blank)   3      5.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values of 구분]

Value   Count  Frequency (%)
1호선   29     51.8%
2호선   27     48.2%
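
The plot's percentages differ from those in the Common Values table because the denominators differ: here 29 and 27 are divided by 56 rather than by all 59 rows, which is consistent with the plot leaving out the 3 rows whose value renders as blank. A quick check of both sets of figures:

```python
counts = {"1호선": 29, "2호선": 27}
blank_rows = 3  # rows whose 구분 value renders as blank in the report

labelled_rows = sum(counts.values())     # 56
all_rows = labelled_rows + blank_rows    # 59

shares = {}
for label, n in counts.items():
    share_all = 100 * n / all_rows            # figure used in the table
    share_labelled = 100 * n / labelled_rows  # figure used in the plot
    shares[label] = (f"{share_all:.1f}", f"{share_labelled:.1f}")
    print(f"{label}: {share_all:.1f}% of all rows, "
          f"{share_labelled:.1f}% of labelled rows")
```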

역명
Text

MISSING 

Distinct: 55
Distinct (%): 98.2%
Missing: 3
Missing (%): 5.1%
Memory size: 604.0 B

Length

Max length: 8
Median length: 7
Mean length: 4.625
Min length: 3

Characters and Unicode

Total characters: 259
Distinct characters: 100
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
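
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category and the name of each code point. The sketch below applies it to the first sample value of this variable (계양역). The helper name `char_properties` is made up for this example, and the script is read off the character name because `unicodedata` has no direct script lookup.

```python
import unicodedata

def char_properties(text):
    """Return (character, general category, first word of the Unicode name)."""
    return [
        (ch, unicodedata.category(ch), unicodedata.name(ch).split()[0])
        for ch in text
    ]

# "Lo" is the general category the report labels "Other Letter".
for ch, category, name_prefix in char_properties("계양역"):
    print(ch, category, name_prefix)
```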

Unique

Unique: 54
Unique (%): 96.4%

Sample

1st row: 계양역
2nd row: 귤현역
3rd row: 박촌역
4th row: 임학역
5th row: 계산역

Value               Count  Frequency (%)
인천시청역          2      3.6%
석남역              1      1.8%
운연역              1      1.8%
인천대공원역        1      1.8%
검단사거리역        1      1.8%
마전역              1      1.8%
완정역              1      1.8%
독정역              1      1.8%
검암역              1      1.8%
검바위역            1      1.8%
Other values (45)   45     80.4%

Most occurring characters

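Value               Count  Frequency (%)
[not extracted]     56     21.6%
[not extracted]     8      3.1%
[not extracted]     8      3.1%
[not extracted]     7      2.7%
[not extracted]     6      2.3%
[not extracted]     6      2.3%
[not extracted]     5      1.9%
[not extracted]     5      1.9%
[not extracted]     5      1.9%
[not extracted]     5      1.9%
Other values (90)   148    57.1%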

Most occurring categories

Value          Count  Frequency (%)
Other Letter   259    100.0%

Most frequent character per category

Other Letter
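Value               Count  Frequency (%)
[not extracted]     56     21.6%
[not extracted]     8      3.1%
[not extracted]     8      3.1%
[not extracted]     7      2.7%
[not extracted]     6      2.3%
[not extracted]     6      2.3%
[not extracted]     5      1.9%
[not extracted]     5      1.9%
[not extracted]     5      1.9%
[not extracted]     5      1.9%
Other values (90)   148    57.1%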

Most occurring scripts

Value    Count  Frequency (%)
Hangul   259    100.0%

Most frequent character per script

Hangul
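Value               Count  Frequency (%)
[not extracted]     56     21.6%
[not extracted]     8      3.1%
[not extracted]     8      3.1%
[not extracted]     7      2.7%
[not extracted]     6      2.3%
[not extracted]     6      2.3%
[not extracted]     5      1.9%
[not extracted]     5      1.9%
[not extracted]     5      1.9%
[not extracted]     5      1.9%
Other values (90)   148    57.1%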

Most occurring blocks

Value    Count  Frequency (%)
Hangul   259    100.0%

Most frequent character per block

Hangul
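Value               Count  Frequency (%)
[not extracted]     56     21.6%
[not extracted]     8      3.1%
[not extracted]     8      3.1%
[not extracted]     7      2.7%
[not extracted]     6      2.3%
[not extracted]     6      2.3%
[not extracted]     5      1.9%
[not extracted]     5      1.9%
[not extracted]     5      1.9%
[not extracted]     5      1.9%
Other values (90)   148    57.1%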

전화번호
Text

MISSING 

Distinct: 56
Distinct (%): 100.0%
Missing: 3
Missing (%): 5.1%
Memory size: 604.0 B

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12
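
Every value is 12 characters long, and the sample rows for this variable all follow a 3-3-4 digit layout separated by hyphens. A small validation sketch under that assumption; the pattern is inferred from the samples in this report rather than from a published specification, and `is_station_phone` is a hypothetical helper.

```python
import re

# Pattern inferred from the sample values, e.g. "032-710-9105".
PHONE_PATTERN = re.compile(r"[0-9]{3}-[0-9]{3}-[0-9]{4}")

def is_station_phone(value):
    """True when value is a string in the assumed NNN-NNN-NNNN layout."""
    return isinstance(value, str) and PHONE_PATTERN.fullmatch(value) is not None

samples = ["032-710-9105", "032-515-9104", "032-519-3122"]
print(all(is_station_phone(s) for s in samples))  # True
print(is_station_phone(None))                     # False for a missing value
```

A check like this flags missing values and malformed entries in one pass, which is useful before joining the table to other station data.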

Characters and Unicode

Total characters: 672
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 56
Unique (%): 100.0%

Sample

1st row: 032-710-9105
2nd row: 032-515-9104
3rd row: 032-519-3122
4th row: 032-541-3113
5th row: 032-546-3151

Value               Count  Frequency (%)
032-515-9104        1      1.8%
032-519-3122        1      1.8%
032-451-4303        1      1.8%
032-451-4304        1      1.8%
032-451-4305        1      1.8%
032-451-4306        1      1.8%
032-451-4307        1      1.8%
032-451-4308        1      1.8%
032-451-4309        1      1.8%
032-451-4310        1      1.8%
Other values (46)   46     82.1%

Most occurring characters

Value  Count  Frequency (%)
3      121    18.0%
-      112    16.7%
1      96     14.3%
2      84     12.5%
4      83     12.4%
0      73     10.9%
5      58     8.6%
8      18     2.7%
6      11     1.6%
9      9      1.3%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     560    83.3%
Dash Punctuation   112    16.7%
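
These totals are what a fixed layout predicts: 56 non-missing numbers of 12 characters each give 672 characters, of which 2 per number are hyphens. The sketch below counts Unicode general categories with the standard `unicodedata` module for one sample value from this report and scales the result by the 56 non-missing rows. The scaling step assumes every value has the same digit and hyphen layout as the samples.

```python
import unicodedata
from collections import Counter

sample = "032-710-9105"
non_missing_rows = 56

# "Nd" is Decimal Number and "Pd" is Dash Punctuation.
per_value = Counter(unicodedata.category(ch) for ch in sample)
totals = {cat: n * non_missing_rows for cat, n in per_value.items()}

print(dict(per_value))  # {'Nd': 10, 'Pd': 2}
print(totals)           # {'Nd': 560, 'Pd': 112}
```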

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
3      121    21.6%
1      96     17.1%
2      84     15.0%
4      83     14.8%
0      73     13.0%
5      58     10.4%
8      18     3.2%
6      11     2.0%
9      9      1.6%
7      7      1.2%
Dash Punctuation
Value  Count  Frequency (%)
-      112    100.0%

Most occurring scripts

Value    Count  Frequency (%)
Common   672    100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
3      121    18.0%
-      112    16.7%
1      96     14.3%
2      84     12.5%
4      83     12.4%
0      73     10.9%
5      58     8.6%
8      18     2.7%
6      11     1.6%
9      9      1.3%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   672    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
3      121    18.0%
-      112    16.7%
1      96     14.3%
2      84     12.5%
4      83     12.4%
0      73     10.9%
5      58     8.6%
8      18     2.7%
6      11     1.6%
9      9      1.3%

주 소
Text

MISSING 

Distinct: 56
Distinct (%): 100.0%
Missing: 3
Missing (%): 5.1%
Memory size: 604.0 B

Length

Max length: 27
Median length: 25
Mean length: 21.928571
Min length: 16

Characters and Unicode

Total characters: 1228
Distinct characters: 83
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 56
Unique (%): 100.0%

Sample

1st row: 인천광역시 계양구 다남로 24
2nd row: 인천광역시 계양구 장제로 1136
3rd row: 인천광역시 계양구 장제로 지하 992
4th row: 인천광역시 계양구 장제로 지하 875
5th row: 인천광역시 계양구 경명대로 지하 1089

Value               Count  Frequency (%)
인천광역시          56     19.7%
지하                33     11.6%
서구                17     6.0%
연수구              12     4.2%
남동구              10     3.5%
계양구              7      2.5%
부평구              6      2.1%
구월로              5      1.8%
경원대로            5      1.8%
남구                4      1.4%
Other values (97)   129    45.4%

Most occurring characters

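Value               Count  Frequency (%)
(space)             228    18.6%
[not extracted]     66     5.4%
[not extracted]     63     5.1%
[not extracted]     61     5.0%
[not extracted]     60     4.9%
[not extracted]     60     4.9%
[not extracted]     60     4.9%
[not extracted]     56     4.6%
[not extracted]     37     3.0%
[not extracted]     34     2.8%
Other values (73)   503    41.0%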

Most occurring categories

Value               Count  Frequency (%)
Other Letter        789    64.3%
Space Separator     228    18.6%
Decimal Number      156    12.7%
Close Punctuation   27     2.2%
Open Punctuation    27     2.2%
Dash Punctuation    1      0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
[not extracted]     66     8.4%
[not extracted]     63     8.0%
[not extracted]     61     7.7%
[not extracted]     60     7.6%
[not extracted]     60     7.6%
[not extracted]     60     7.6%
[not extracted]     56     7.1%
[not extracted]     37     4.7%
[not extracted]     34     4.3%
[not extracted]     33     4.2%
Other values (59)   259    32.8%
Decimal Number
Value  Count  Frequency (%)
2      21     13.5%
1      19     12.2%
6      17     10.9%
8      17     10.9%
7      16     10.3%
3      15     9.6%
5      14     9.0%
9      14     9.0%
4      13     8.3%
0      10     6.4%
Space Separator
Value     Count  Frequency (%)
(space)   228    100.0%
Close Punctuation
Value  Count  Frequency (%)
)      27     100.0%
Open Punctuation
Value  Count  Frequency (%)
(      27     100.0%
Dash Punctuation
Value  Count  Frequency (%)
-      1      100.0%

Most occurring scripts

Value    Count  Frequency (%)
Hangul   789    64.3%
Common   439    35.7%

Most frequent character per script

Hangul
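Value               Count  Frequency (%)
[not extracted]     66     8.4%
[not extracted]     63     8.0%
[not extracted]     61     7.7%
[not extracted]     60     7.6%
[not extracted]     60     7.6%
[not extracted]     60     7.6%
[not extracted]     56     7.1%
[not extracted]     37     4.7%
[not extracted]     34     4.3%
[not extracted]     33     4.2%
Other values (59)   259    32.8%
Common
Value              Count  Frequency (%)
(space)            228    51.9%
)                  27     6.2%
(                  27     6.2%
2                  21     4.8%
1                  19     4.3%
6                  17     3.9%
8                  17     3.9%
7                  16     3.6%
3                  15     3.4%
5                  14     3.2%
Other values (4)   38     8.7%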

Most occurring blocks

Value    Count  Frequency (%)
Hangul   789    64.3%
ASCII    439    35.7%

Most frequent character per block

ASCII
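Value              Count  Frequency (%)
(space)            228    51.9%
)                  27     6.2%
(                  27     6.2%
2                  21     4.8%
1                  19     4.3%
6                  17     3.9%
8                  17     3.9%
7                  16     3.6%
3                  15     3.4%
5                  14     3.2%
Other values (4)   38     8.7%
Hangul
Value               Count  Frequency (%)
[not extracted]     66     8.4%
[not extracted]     63     8.0%
[not extracted]     61     7.7%
[not extracted]     60     7.6%
[not extracted]     60     7.6%
[not extracted]     60     7.6%
[not extracted]     56     7.1%
[not extracted]     37     4.7%
[not extracted]     34     4.3%
[not extracted]     33     4.2%
Other values (59)   259    32.8%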

Correlations

[Figure: correlation heatmap]

            구분    역명    전화번호  주 소
구분        1.000   0.000   1.000     1.000
역명        0.000   1.000   1.000     1.000
전화번호    1.000   1.000   1.000     1.000
주 소       1.000   1.000   1.000     1.000

Missing values

[Figure: count] A simple visualization of nullity by column.
[Figure: matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure: heatmap] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
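
In this dataset the three text columns are empty in the same 3 rows (rows 56 to 58 of the sample), so their pairwise nullity correlation is 1, while 구분 has no missing values and therefore no defined nullity correlation. A minimal pure-Python sketch of that calculation, using Pearson correlation on 0/1 missingness indicators. The row layout is reconstructed from the counts in this report, and `pearson` is an illustrative helper rather than the profiler's own code.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation; returns None when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return None
    return cov / sqrt(vx * vy)

# 1 marks a missing cell: 56 complete rows followed by 3 empty ones.
missing = {
    "구분": [0] * 59,
    "역명": [0] * 56 + [1] * 3,
    "전화번호": [0] * 56 + [1] * 3,
    "주 소": [0] * 56 + [1] * 3,
}

print(pearson(missing["역명"], missing["전화번호"]))  # approximately 1.0
print(pearson(missing["구분"], missing["역명"]))      # None: 구분 is never missing
```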

Sample

     구분     역명              전화번호       주 소
0    1호선    계양역            032-710-9105   인천광역시 계양구 다남로 24
1    1호선    귤현역            032-515-9104   인천광역시 계양구 장제로 1136
2    1호선    박촌역            032-519-3122   인천광역시 계양구 장제로 지하 992
3    1호선    임학역            032-541-3113   인천광역시 계양구 장제로 지하 875
4    1호선    계산역            032-546-3151   인천광역시 계양구 경명대로 지하 1089
5    1호선    경인교대입구역    032-553-3394   인천광역시 계양구 계양대로 지하 162
6    1호선    작전역            032-543-3116   인천광역시 계양구 계양대로 지하 73
7    1호선    갈산역            032-511-4245   인천광역시 부평구 부평대로 지하 286
8    1호선    부평구청역        032-513-3118   인천광역시 부평구 부평대로 지하 189
9    1호선    부평시장역        032-512-3119   인천광역시 부평구 부평대로 지하 69

     구분     역명              전화번호       주 소
49   2호선    인천시청역        032-451-4321   인천광역시 남동구 구월로 지하 99 (간석동)
50   2호선    석천사거리역      032-451-4322   인천광역시 남동구 구월로 지하 181 (구월동)
51   2호선    모래내시장역      032-451-4323   인천광역시 남동구 구월로 지하 255 (구월동)
52   2호선    만수역            032-451-4324   인천광역시 남동구 구월로 지하 367 (만수동)
53   2호선    남동구청역        032-451-4325   인천광역시 남동구 인주대로 지하 889 (만수동)
54   2호선    인천대공원역      032-451-4326   인천광역시 남동구 수인로 3677 (장수동)
55   2호선    운연역            032-451-4327   인천광역시 남동구 매소홀로 1229 (운연동)
56   (blank)  <NA>              <NA>           <NA>
57   (blank)  <NA>              <NA>           <NA>
58   (blank)  <NA>              <NA>           <NA>

Duplicate rows

Most frequently occurring

     구분     역명   전화번호  주 소  # duplicates
0    (blank)  <NA>   <NA>      <NA>   3
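
The duplicated record is the row whose three text fields are all missing, which occurs 3 times at the end of the data. A sketch of dropping such rows before further analysis, using plain dictionaries with `None` standing in for `<NA>`. The two station rows are copied from the sample above, the blank 구분 value is assumed to be a single space, and `is_empty_record` is a hypothetical helper.

```python
TEXT_COLUMNS = ("역명", "전화번호", "주 소")

def is_empty_record(row):
    """True when every text column of the row is missing."""
    return all(row[col] is None for col in TEXT_COLUMNS)

rows = [
    {"구분": "2호선", "역명": "인천대공원역", "전화번호": "032-451-4326",
     "주 소": "인천광역시 남동구 수인로 3677 (장수동)"},
    {"구분": "2호선", "역명": "운연역", "전화번호": "032-451-4327",
     "주 소": "인천광역시 남동구 매소홀로 1229 (운연동)"},
] + [
    # ASSUMPTION: the blank 구분 value is a single space.
    {"구분": " ", "역명": None, "전화번호": None, "주 소": None}
    for _ in range(3)
]

cleaned = [row for row in rows if not is_empty_record(row)]
print(len(rows), len(cleaned))  # 5 2
```

Filtering on the text columns rather than on 구분 matters here, because 구분 is never reported as missing and so cannot identify the empty rows on its own.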