Overview

Dataset statistics

Number of variables: 4
Number of observations: 57
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.0 KiB
Average record size in memory: 35.3 B

Variable types

Categorical: 1
Text: 3

Dataset

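Description: Foreign-language station names for Incheon Subway Line 1 and Line 2, operated by Incheon Transit Corporation (인천교통공사). The names are given in Korean, Hanja (Chinese characters), and English. (The fields are line, station name, Hanja, and English name.)
URL: https://www.data.go.kr/data/15043808/fileData.do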

Alerts

한 글 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 14:32:42.367252
Analysis finished: 2023-12-12 14:32:42.827637
Duration: 0.46 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

호선
Categorical

Distinct: 2
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Memory size: 588.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 30 | 52.6%
2 | 27 | 47.4%
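
The Frequency (%) column is each count divided by the 57 observations. A minimal sketch of that arithmetic follows, assuming the 호선 column can be rebuilt from the counts listed above. The list `line_column` and the helper `value_frequencies` are illustrative names, not part of ydata-profiling.

```python
from collections import Counter

# Assumed reconstruction of the 호선 column from the reported counts:
# 30 rows on line 1 and 27 rows on line 2 (57 observations in total).
line_column = ["1"] * 30 + ["2"] * 27


def value_frequencies(values):
    """Return (value, count, percent) tuples, most common value first."""
    counts = Counter(values)
    total = len(values)
    return [(v, c, round(100 * c / total, 1)) for v, c in counts.most_common()]


frequencies = value_frequencies(line_column)
for value, count, percent in frequencies:
    print(f"{value} | {count} | {percent}%")
```

Rounding to one decimal place reproduces the percentages shown in the table.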

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
1 | 30 | 52.6%
2 | 27 | 47.4%

한 글
Text

UNIQUE 

Distinct: 57
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 588.0 B

Length

Max length: 9
Median length: 7
Mean length: 4.5087719
Min length: 2

Characters and Unicode

Total characters: 257
Distinct characters: 100
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
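
As an illustration of the character properties mentioned above, the sketch below uses Python's standard `unicodedata` module to count general categories over the first five sample values of this column as they are displayed in this report (계양, 귤현, 박촌, 임학, 계산). It is not how the report itself was computed, and the helper name `category_counts` is hypothetical. The standard library exposes the general category (for example `Lo` for Other Letter and `Zs` for Space Separator) but not the script or block, so only categories are counted here.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    return Counter(unicodedata.category(ch) for text in values for ch in text)


# Sample values as displayed in the report; the raw data may differ in whitespace.
sample = ["계양", "귤현", "박촌", "임학", "계산"]
counts = category_counts(sample)
print(counts)
```

Hangul syllables fall in the `Lo` category, which corresponds to the Other Letter rows in the tables of this section.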

Unique

Unique: 57
Unique (%): 100.0%

Sample

1st row: 계양
2nd row: 귤현
3rd row: 박촌
4th row: 임학
5th row: 계산

Value | Count | Frequency (%)
인천시청 | 2 | 3.5%
계양 | 1 | 1.8%
국제업무지구 | 1 | 1.8%
왕길 | 1 | 1.8%
검단사거리 | 1 | 1.8%
마전 | 1 | 1.8%
완정 | 1 | 1.8%
독정 | 1 | 1.8%
검암 | 1 | 1.8%
검바위 | 1 | 1.8%
Other values (46) | 46 | 80.7%

Most occurring characters

Value | Count | Frequency (%)
(space) | 54 | 21.0%
 | 8 | 3.1%
 | 8 | 3.1%
 | 7 | 2.7%
 | 6 | 2.3%
 | 6 | 2.3%
 | 5 | 1.9%
 | 5 | 1.9%
 | 5 | 1.9%
 | 5 | 1.9%
Other values (90) | 148 | 57.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 203 | 79.0%
Space Separator | 54 | 21.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 8 | 3.9%
 | 8 | 3.9%
 | 7 | 3.4%
 | 6 | 3.0%
 | 6 | 3.0%
 | 5 | 2.5%
 | 5 | 2.5%
 | 5 | 2.5%
 | 5 | 2.5%
 | 4 | 2.0%
Other values (89) | 144 | 70.9%

Space Separator
Value | Count | Frequency (%)
(space) | 54 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 203 | 79.0%
Common | 54 | 21.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 8 | 3.9%
 | 8 | 3.9%
 | 7 | 3.4%
 | 6 | 3.0%
 | 6 | 3.0%
 | 5 | 2.5%
 | 5 | 2.5%
 | 5 | 2.5%
 | 5 | 2.5%
 | 4 | 2.0%
Other values (89) | 144 | 70.9%

Common
Value | Count | Frequency (%)
(space) | 54 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 203 | 79.0%
ASCII | 54 | 21.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 54 | 100.0%

Hangul
Value | Count | Frequency (%)
 | 8 | 3.9%
 | 8 | 3.9%
 | 7 | 3.4%
 | 6 | 3.0%
 | 6 | 3.0%
 | 5 | 2.5%
 | 5 | 2.5%
 | 5 | 2.5%
 | 5 | 2.5%
 | 4 | 2.0%
Other values (89) | 144 | 70.9%

漢 字
Text

Distinct: 51
Distinct (%): 89.5%
Missing: 0
Missing (%): 0.0%
Memory size: 588.0 B

Length

Max length: 15
Median length: 14
Mean length: 4.122807
Min length: 1

Characters and Unicode

Total characters: 235
Distinct characters: 115
Distinct categories: 5
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 49
Unique (%): 86.0%
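
For this column the report lists 51 distinct values but only 49 unique ones. As far as the figures here suggest, "distinct" counts different values while "unique" counts values that occur exactly once. The sketch below shows that distinction on a toy list and then checks the reported percentages against the 57 observations. The helper `distinct_and_unique` and the toy data are illustrative assumptions, not taken from the dataset.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Toy data (hypothetical): "A" repeats, so it is distinct but not unique.
toy = ["A", "A", "B", "C"]
distinct, unique = distinct_and_unique(toy)
print(distinct, unique)

# Percentages for the reported counts, relative to 57 observations.
distinct_pct = round(100 * 51 / 57, 1)
unique_pct = round(100 * 49 / 57, 1)
print(distinct_pct, unique_pct)
```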

Sample

1st row: 桂陽
2nd row: 橘峴
3rd row: 朴村
4th row: 林鶴
5th row: 桂山

Value | Count | Frequency (%)
仁川市廳 | 2 | 3.5%
知識情報團地 | 1 | 1.8%
거북市場 | 1 | 1.8%
黔丹四거리 | 1 | 1.8%
麻田 | 1 | 1.8%
完井 | 1 | 1.8%
篤亭 | 1 | 1.8%
黔岩 | 1 | 1.8%
아시아드競技場 | 1 | 1.8%
公村四거리 | 1 | 1.8%
Other values (46) | 46 | 80.7%

Most occurring characters

Value | Count | Frequency (%)
(space) | 12 | 5.1%
 | 9 | 3.8%
 | 8 | 3.4%
 | 7 | 3.0%
 | 7 | 3.0%
( | 7 | 3.0%
) | 6 | 2.6%
 | 6 | 2.6%
 | 6 | 2.6%
 | 5 | 2.1%
Other values (105) | 162 | 68.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 209 | 88.9%
Space Separator | 12 | 5.1%
Open Punctuation | 7 | 3.0%
Close Punctuation | 6 | 2.6%
Uppercase Letter | 1 | 0.4%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 9 | 4.3%
 | 8 | 3.8%
 | 7 | 3.3%
 | 7 | 3.3%
 | 6 | 2.9%
 | 6 | 2.9%
 | 5 | 2.4%
 | 4 | 1.9%
 | 4 | 1.9%
 | 4 | 1.9%
Other values (101) | 149 | 71.3%

Space Separator
Value | Count | Frequency (%)
(space) | 12 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 7 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 6 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
J | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Han | 182 | 77.4%
Hangul | 27 | 11.5%
Common | 25 | 10.6%
Latin | 1 | 0.4%

Most frequent character per script

Han
Value | Count | Frequency (%)
 | 9 | 4.9%
 | 8 | 4.4%
 | 7 | 3.8%
 | 7 | 3.8%
 | 5 | 2.7%
 | 4 | 2.2%
 | 4 | 2.2%
 | 4 | 2.2%
 | 4 | 2.2%
 | 4 | 2.2%
Other values (85) | 126 | 69.2%

Hangul
Value | Count | Frequency (%)
 | 6 | 22.2%
 | 6 | 22.2%
 | 2 | 7.4%
 | 1 | 3.7%
 | 1 | 3.7%
 | 1 | 3.7%
 | 1 | 3.7%
 | 1 | 3.7%
 | 1 | 3.7%
 | 1 | 3.7%
Other values (6) | 6 | 22.2%

Common
Value | Count | Frequency (%)
(space) | 12 | 48.0%
( | 7 | 28.0%
) | 6 | 24.0%

Latin
Value | Count | Frequency (%)
J | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
CJK | 181 | 77.0%
Hangul | 27 | 11.5%
ASCII | 26 | 11.1%
CJK Compat Ideographs | 1 | 0.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 12 | 46.2%
( | 7 | 26.9%
) | 6 | 23.1%
J | 1 | 3.8%

CJK
Value | Count | Frequency (%)
 | 9 | 5.0%
 | 8 | 4.4%
 | 7 | 3.9%
 | 7 | 3.9%
 | 5 | 2.8%
 | 4 | 2.2%
 | 4 | 2.2%
 | 4 | 2.2%
 | 4 | 2.2%
 | 4 | 2.2%
Other values (84) | 125 | 69.1%

Hangul
Value | Count | Frequency (%)
 | 6 | 22.2%
 | 6 | 22.2%
 | 2 | 7.4%
 | 1 | 3.7%
 | 1 | 3.7%
 | 1 | 3.7%
 | 1 | 3.7%
 | 1 | 3.7%
 | 1 | 3.7%
 | 1 | 3.7%
Other values (6) | 6 | 22.2%

CJK Compat Ideographs
Value | Count | Frequency (%)
 | 1 | 100.0%

로마字
Text

Distinct: 56
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Memory size: 588.0 B

Length

Max length: 51
Median length: 29
Mean length: 14.491228
Min length: 4

Characters and Unicode

Total characters: 826
Distinct characters: 54
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 55
Unique (%): 96.5%

Sample

1st row: Gyeyang
2nd row: Gyulhyeon
3rd row: Bakchon
4th row: Imhak
5th row: Gyesan

Value | Count | Frequency (%)
incheon | 7 | 6.0%
market | 5 | 4.3%
office | 3 | 2.6%
sageori | 3 | 2.6%
geomdan | 3 | 2.6%
city | 3 | 2.6%
complex | 3 | 2.6%
park | 3 | 2.6%
gajeong | 2 | 1.7%
univ | 2 | 1.7%
Other values (74) | 82 | 70.7%

Most occurring characters

Value | Count | Frequency (%)
n | 83 | 10.0%
e | 73 | 8.8%
a | 67 | 8.1%
o | 65 | 7.9%
(space) | 59 | 7.1%
i | 33 | 4.0%
u | 32 | 3.9%
g | 30 | 3.6%
r | 29 | 3.5%
t | 29 | 3.5%
Other values (44) | 326 | 39.5%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 626 | 75.8%
Uppercase Letter | 116 | 14.0%
Space Separator | 59 | 7.1%
Close Punctuation | 7 | 0.8%
Open Punctuation | 7 | 0.8%
Other Punctuation | 5 | 0.6%
Dash Punctuation | 4 | 0.5%
Decimal Number | 1 | 0.1%
Final Punctuation | 1 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
n | 83 | 13.3%
e | 73 | 11.7%
a | 67 | 10.7%
o | 65 | 10.4%
i | 33 | 5.3%
u | 32 | 5.1%
g | 30 | 4.8%
r | 29 | 4.6%
t | 29 | 4.6%
l | 23 | 3.7%
Other values (15) | 162 | 25.9%

Uppercase Letter
Value | Count | Frequency (%)
G | 18 | 15.5%
C | 15 | 12.9%
S | 12 | 10.3%
I | 12 | 10.3%
M | 9 | 7.8%
B | 8 | 6.9%
J | 5 | 4.3%
D | 5 | 4.3%
W | 5 | 4.3%
O | 4 | 3.4%
Other values (10) | 23 | 19.8%

Other Punctuation
Value | Count | Frequency (%)
' | 3 | 60.0%
& | 1 | 20.0%
. | 1 | 20.0%

Space Separator
Value | Count | Frequency (%)
(space) | 59 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 7 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 7 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 4 | 100.0%

Decimal Number
Value | Count | Frequency (%)
1 | 1 | 100.0%

Final Punctuation
Value | Count | Frequency (%)
 | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 742 | 89.8%
Common | 84 | 10.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
n | 83 | 11.2%
e | 73 | 9.8%
a | 67 | 9.0%
o | 65 | 8.8%
i | 33 | 4.4%
u | 32 | 4.3%
g | 30 | 4.0%
r | 29 | 3.9%
t | 29 | 3.9%
l | 23 | 3.1%
Other values (35) | 278 | 37.5%

Common
Value | Count | Frequency (%)
(space) | 59 | 70.2%
) | 7 | 8.3%
( | 7 | 8.3%
- | 4 | 4.8%
' | 3 | 3.6%
& | 1 | 1.2%
. | 1 | 1.2%
1 | 1 | 1.2%
 | 1 | 1.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 825 | 99.9%
Punctuation | 1 | 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
n | 83 | 10.1%
e | 73 | 8.8%
a | 67 | 8.1%
o | 65 | 7.9%
(space) | 59 | 7.2%
i | 33 | 4.0%
u | 32 | 3.9%
g | 30 | 3.6%
r | 29 | 3.5%
t | 29 | 3.5%
Other values (43) | 325 | 39.4%

Punctuation
Value | Count | Frequency (%)
 | 1 | 100.0%

Correlations

 | 호선 | 한 글 | 漢 字 | 로마字
호선 | 1.000 | 1.000 | 0.000 | 0.000
한 글 | 1.000 | 1.000 | 1.000 | 1.000
漢 字 | 0.000 | 1.000 | 1.000 | 1.000
로마字 | 0.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
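
The per-column nullity summarised by these plots can be sketched as a simple count of missing cells. The example below is a minimal illustration, assuming rows are held as dictionaries and that `None` marks a missing cell. The helper `missing_per_column` is a hypothetical name, and the two rows are the first two sample rows shown in this report.

```python
def missing_per_column(rows, columns):
    """Count, for each column, the rows whose cell is None (assumed missing marker)."""
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}


columns = ["호선", "한 글", "漢 字", "로마字"]

# First two sample rows as displayed in the report.
rows = [
    {"호선": "1", "한 글": "계양", "漢 字": "桂陽", "로마字": "Gyeyang"},
    {"호선": "1", "한 글": "귤현", "漢 字": "橘峴", "로마字": "Gyulhyeon"},
]

missing = missing_per_column(rows, columns)
print(missing)
```

A count of zero in every column is consistent with the "Missing cells: 0" figure in the overview.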

Sample

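First rows

 | 호선 | 한 글 | 漢 字 | 로마字
0 | 1 | 계양 | 桂陽 | Gyeyang
1 | 1 | 귤현 | 橘峴 | Gyulhyeon
2 | 1 | 박촌 | 朴村 | Bakchon
3 | 1 | 임학 | 林鶴 | Imhak
4 | 1 | 계산 | 桂山 | Gyesan
5 | 1 | 경인교대 | 京仁敎大入口 | Gyeong-in Nat'l
6 | 1 | 입구 |  | Univ. of Education
7 | 1 | 작전 | 鵲田 | Jakjeon
8 | 1 | 갈산 | 葛山 | Galsan
9 | 1 | 부평구청 | 富平區廳 | Bupyeong-gu Office

Last rows

 | 호선 | 한 글 | 漢 字 | 로마字
47 | 2 | 주안 | 朱安 | Juan
48 | 2 | 시민공원 | 市民公園 (文化創作地帶) | Citizens Park (Culture Creation Zone)
49 | 2 | 석바위시장 | 석바위市場 | Seokbawi Market
50 | 2 | 인천시청 | 仁川市廳 | Incheon City Hall
51 | 2 | 석천사거리 | 石泉四거리 | Seokcheon Sageori
52 | 2 | 모래내시장 | 모래내市場 | Moraenae Market
53 | 2 | 만수 | 萬壽 | Mansu
54 | 2 | 남동구청 | 南洞區廳 | Namdong-gu Office
55 | 2 | 인천대공원 | 仁川大公園 | Incheon Grand Park
56 | 2 | 운연 | 云宴 (西昌) | Unyeon (Seochang)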