Overview

Dataset statistics

Number of variables: 5
Number of observations: 300
Missing cells: 1
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 12.1 KiB
Average record size in memory: 41.4 B
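The missing-cell percentage can be checked by hand from the counts in this table. The sketch below assumes the percentage is the number of missing cells divided by rows times columns; the function name is illustrative and not part of ydata-profiling.

```python
def missing_cell_share(missing_cells: int, n_rows: int, n_cols: int) -> float:
    """Return missing cells as a percentage of all cells in the table."""
    return 100.0 * missing_cells / (n_rows * n_cols)


# Figures taken from the dataset statistics: 1 missing cell, 300 rows, 5 columns.
share = missing_cell_share(missing_cells=1, n_rows=300, n_cols=5)
print(f"{share:.1f}%")  # 1 / 1500 is about 0.067%, shown as 0.1% at one decimal
```

With a single missing cell out of 1,500, the exact share is well below the rounded 0.1% that the report displays.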

Variable types

Numeric: 1
Text: 4

Dataset

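Description: Hanja and English renderings of Korean railway station names. The data provides five fields: number, Hangul name, English name, Hanja (traditional characters), and address.
Author: 한국철도공사 (Korea Railroad Corporation)
URL: https://www.data.go.kr/data/15042115/fileData.do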

Alerts

번호 has unique values (Unique)
역명 has unique values (Unique)
영문 has unique values (Unique)
주소 has unique values (Unique)
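A "unique values" alert means that no value in the column repeats. As a minimal sketch (not the library's own implementation; the helper name is hypothetical), the same condition can be tested in plain Python:

```python
def has_unique_values(values):
    """Return True when no non-missing value occurs more than once."""
    present = [v for v in values if v is not None]
    return len(set(present)) == len(present)


# The five sample station names shown later in this report.
sample_names = ["가남", "가수원", "가야", "각계", "감곡"]
print(has_unique_values(sample_names))           # True
print(has_unique_values(sample_names + ["가남"]))  # False, 가남 now appears twice
```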

Reproduction

Analysis started: 2023-12-12 19:32:06.969838
Analysis finished: 2023-12-12 19:32:07.810352
Duration: 0.84 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
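The reported duration follows directly from the two timestamps. A short check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 19:32:06.969838")
finished = datetime.fromisoformat("2023-12-12 19:32:07.810352")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 0.84 seconds
```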

Variables

번호
Real number (ℝ)

UNIQUE 

Distinct: 300
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 150.5
Minimum: 1
Maximum: 300
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 15.95
Q1: 75.75
Median: 150.5
Q3: 225.25
95-th percentile: 285.05
Maximum: 300
Range: 299
Interquartile range (IQR): 149.5
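Because 번호 runs from 1 to 300 without gaps, these quantiles can be reproduced by linear interpolation between neighbouring ranks. The sketch below assumes that interpolation rule (it matches the figures above); the `quantile` helper is written for illustration only.

```python
def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest ranks."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


numbers = list(range(1, 301))  # 번호 is 1..300, strictly increasing

q1 = quantile(numbers, 0.25)
q3 = quantile(numbers, 0.75)
p05 = quantile(numbers, 0.05)
p95 = quantile(numbers, 0.95)
median = quantile(numbers, 0.5)

print(round(p05, 2), q1, median, q3, round(p95, 2), q3 - q1)
```

The interquartile range is simply Q3 minus Q1, and the range is the maximum minus the minimum (300 - 1 = 299).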

Descriptive statistics

Standard deviation: 86.746758
Coefficient of variation (CV): 0.57639042
Kurtosis: -1.2
Mean: 150.5
Median Absolute Deviation (MAD): 75
Skewness: 0
Sum: 45150
Variance: 7525
Monotonicity: Strictly increasing
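Most of these figures can be recomputed with the standard library, again using the fact that 번호 is the sequence 1 to 300. This is a sketch under the assumption that the variance uses the sample (n - 1) denominator, which is what the value 7525 implies.

```python
import math
import statistics

numbers = list(range(1, 301))  # 번호 is 1..300

total = sum(numbers)
mean = statistics.mean(numbers)
variance = statistics.variance(numbers)  # sample variance, n - 1 denominator
std_dev = math.sqrt(variance)
cv = std_dev / mean                      # coefficient of variation
median = statistics.median(numbers)
mad = statistics.median(abs(x - median) for x in numbers)
strictly_increasing = all(a < b for a, b in zip(numbers, numbers[1:]))

print(total, mean, variance, round(std_dev, 6), round(cv, 8), mad, strictly_increasing)
```

The coefficient of variation is the standard deviation divided by the mean, and the MAD is the median of the absolute deviations from the median.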
Histogram with fixed size bins (bins=50)
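To illustrate what fixed-size binning does to this column, the sketch below splits the value range into 50 equal-width bins and counts the values in each. It is a hand-written approximation of the idea, not the plotting code used by the report; the function name is hypothetical.

```python
def fixed_width_histogram(values, bins):
    """Count values per equal-width bin over [min, max]; the last bin includes max."""
    low, high = min(values), max(values)
    width = (high - low) / bins
    counts = [0] * bins
    for value in values:
        # Clamp so that the maximum value falls into the final bin.
        index = min(int((value - low) / width), bins - 1)
        counts[index] += 1
    return counts


counts = fixed_width_histogram(list(range(1, 301)), bins=50)
print(len(counts), set(counts))
```

Since the 300 values are evenly spaced, every one of the 50 bins receives the same number of values, so the histogram is flat.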
Common values
Value | Count | Frequency (%)
1 | 1 | 0.3%
208 | 1 | 0.3%
206 | 1 | 0.3%
205 | 1 | 0.3%
204 | 1 | 0.3%
203 | 1 | 0.3%
202 | 1 | 0.3%
201 | 1 | 0.3%
200 | 1 | 0.3%
199 | 1 | 0.3%
Other values (290) | 290 | 96.7%
Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 0.3%
2 | 1 | 0.3%
3 | 1 | 0.3%
4 | 1 | 0.3%
5 | 1 | 0.3%
6 | 1 | 0.3%
7 | 1 | 0.3%
8 | 1 | 0.3%
9 | 1 | 0.3%
10 | 1 | 0.3%
Maximum 10 values
Value | Count | Frequency (%)
300 | 1 | 0.3%
299 | 1 | 0.3%
298 | 1 | 0.3%
297 | 1 | 0.3%
296 | 1 | 0.3%
295 | 1 | 0.3%
294 | 1 | 0.3%
293 | 1 | 0.3%
292 | 1 | 0.3%
291 | 1 | 0.3%

역명
Text

UNIQUE 

Distinct: 300
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 7
Median length: 2
Mean length: 2.28
Min length: 2

Characters and Unicode

Total characters: 684
Distinct characters: 195
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 300
Unique (%): 100.0%

Sample

1st row: 가남
2nd row: 가수원
3rd row: 가야
4th row: 각계
5th row: 감곡
Value | Count | Frequency (%)
가남 | 1 | 0.3%
원주 | 1 | 0.3%
원동 | 1 | 0.3%
웅천 | 1 | 0.3%
울산 | 1 | 0.3%
우보 | 1 | 0.3%
용산 | 1 | 0.3%
용동 | 1 | 0.3%
용궁 | 1 | 0.3%
왜관 | 1 | 0.3%
Other values (290) | 290 | 96.7%

Most occurring characters

ValueCountFrequency (%)
31
 
4.5%
23
 
3.4%
20
 
2.9%
19
 
2.8%
17
 
2.5%
14
 
2.0%
14
 
2.0%
13
 
1.9%
12
 
1.8%
11
 
1.6%
Other values (185) 510
74.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 680 | 99.4%
Close Punctuation | 2 | 0.3%
Open Punctuation | 2 | 0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
31
 
4.6%
23
 
3.4%
20
 
2.9%
19
 
2.8%
17
 
2.5%
14
 
2.1%
14
 
2.1%
13
 
1.9%
12
 
1.8%
11
 
1.6%
Other values (183) 506
74.4%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 680 | 99.4%
Common | 4 | 0.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
31
 
4.6%
23
 
3.4%
20
 
2.9%
19
 
2.8%
17
 
2.5%
14
 
2.1%
14
 
2.1%
13
 
1.9%
12
 
1.8%
11
 
1.6%
Other values (183) 506
74.4%
Common
ValueCountFrequency (%)
) 2
50.0%
( 2
50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 680 | 99.4%
ASCII | 4 | 0.6%

Most frequent character per block

Hangul
ValueCountFrequency (%)
31
 
4.6%
23
 
3.4%
20
 
2.9%
19
 
2.8%
17
 
2.5%
14
 
2.1%
14
 
2.1%
13
 
1.9%
12
 
1.8%
11
 
1.6%
Other values (183) 506
74.4%
ASCII
ValueCountFrequency (%)
) 2
50.0%
( 2
50.0%

영문
Text

UNIQUE 

Distinct: 300
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 22
Median length: 16
Mean length: 7.5633333
Min length: 3

Characters and Unicode

Total characters: 2269
Distinct characters: 46
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 300
Unique (%): 100.0%

Sample

1st row: Ganam
2nd row: Gasuwon
3rd row: Gaya
4th row: Gakgye
5th row: Gamgok
Value | Count | Frequency (%)
ganam | 1 | 0.3%
yongdong | 1 | 0.3%
wonju | 1 | 0.3%
wolleung | 1 | 0.3%
wondong | 1 | 0.3%
ungcheon | 1 | 0.3%
ulsan | 1 | 0.3%
ubo | 1 | 0.3%
yongsan | 1 | 0.3%
yonggung | 1 | 0.3%
Other values (294) | 294 | 96.7%

Most occurring characters

Value | Count | Frequency (%)
n | 357 | 15.7%
o | 260 | 11.5%
g | 225 | 9.9%
a | 207 | 9.1%
e | 207 | 9.1%
u | 99 | 4.4%
h | 73 | 3.2%
i | 68 | 3.0%
s | 64 | 2.8%
y | 55 | 2.4%
Other values (36) | 654 | 28.8%
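A character-frequency table of this kind can be built by counting every character across all values and dividing by the total. The sketch below applies that idea to the five sample 영문 values from this report; the helper name is illustrative, and the result covers only those five rows, not the full column.

```python
from collections import Counter


def character_shares(texts):
    """Return (character, count, share in %) tuples, most frequent first."""
    counts = Counter("".join(texts))
    total = sum(counts.values())
    return [(ch, n, 100.0 * n / total) for ch, n in counts.most_common()]


sample = ["Ganam", "Gasuwon", "Gaya", "Gakgye", "Gamgok"]
rows = character_shares(sample)
print(rows[0])  # ('a', 7, 25.0)
```

In the full column the same arithmetic gives, for example, 357 / 2269, or about 15.7%, for the letter n.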

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1935 | 85.3%
Uppercase Letter | 311 | 13.7%
Dash Punctuation | 13 | 0.6%
Space Separator | 5 | 0.2%
Close Punctuation | 2 | 0.1%
Open Punctuation | 2 | 0.1%
Other Punctuation | 1 | < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 357
18.4%
o 260
13.4%
g 225
11.6%
a 207
10.7%
e 207
10.7%
u 99
 
5.1%
h 73
 
3.8%
i 68
 
3.5%
s 64
 
3.3%
y 55
 
2.8%
Other values (12) 320
16.5%
Uppercase Letter
ValueCountFrequency (%)
G 45
14.5%
S 44
14.1%
J 33
10.6%
B 25
8.0%
H 23
7.4%
Y 20
 
6.4%
D 18
 
5.8%
M 17
 
5.5%
C 16
 
5.1%
N 14
 
4.5%
Other values (9) 56
18.0%
Dash Punctuation
ValueCountFrequency (%)
- 13
100.0%
Space Separator
ValueCountFrequency (%)
5
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Other Punctuation
ValueCountFrequency (%)
' 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2246 | 99.0%
Common | 23 | 1.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 357
15.9%
o 260
11.6%
g 225
 
10.0%
a 207
 
9.2%
e 207
 
9.2%
u 99
 
4.4%
h 73
 
3.3%
i 68
 
3.0%
s 64
 
2.8%
y 55
 
2.4%
Other values (31) 631
28.1%
Common
ValueCountFrequency (%)
- 13
56.5%
5
 
21.7%
) 2
 
8.7%
( 2
 
8.7%
' 1
 
4.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2269 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 357
15.7%
o 260
 
11.5%
g 225
 
9.9%
a 207
 
9.1%
e 207
 
9.1%
u 99
 
4.4%
h 73
 
3.2%
i 68
 
3.0%
s 64
 
2.8%
y 55
 
2.4%
Other values (36) 654
28.8%

한자
Text

Distinct: 298
Distinct (%): 99.7%
Missing: 1
Missing (%): 0.3%
Memory size: 2.5 KiB

Length

Max length: 10
Median length: 2
Mean length: 2.2842809
Min length: 1

Characters and Unicode

Total characters: 683
Distinct characters: 314
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
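The mean length of each text column appears to be the total character count divided by the number of non-missing rows. That is an inference from the figures in this report, not a documented rule; the sketch below checks it for all four text columns, and the helper name is hypothetical.

```python
def mean_length(total_characters, n_rows, n_missing=0):
    """Average characters per non-missing value."""
    return total_characters / (n_rows - n_missing)


# (total characters, missing rows) as reported for each text column; 300 rows each.
columns = {"역명": (684, 0), "영문": (2269, 0), "한자": (683, 1), "주소": (5573, 0)}
means = {name: mean_length(total, 300, missing) for name, (total, missing) in columns.items()}

for name, value in means.items():
    print(name, round(value, 7))
```

For 한자 the divisor is 299 rather than 300 because one value is missing; dividing by 300 would give about 2.2767 instead of the reported 2.2842809.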

Unique

Unique: 297
Unique (%): 99.3%
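"Distinct" and "Unique" differ here: distinct counts how many different values occur, while unique counts values that occur exactly once. With 299 non-missing rows, 298 distinct values and 297 unique values, exactly one value must appear twice. A minimal sketch of the two counts follows; the toy values are made up for illustration and do not come from the dataset.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct count, unique count, non-missing count)."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique, len(present)


# Toy column: one repeated value ("x") and one missing entry.
toy = ["加南", "伽倻", "x", "x", None]
print(distinct_and_unique(toy))  # (3, 2, 4)

# The percentages in the report use the non-missing count as the denominator.
print(round(100 * 298 / 299, 1), round(100 * 297 / 299, 1))  # 99.7 99.3
```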

Sample

1st row: 加南
2nd row: 佳水院
3rd row: 伽倻
4th row: 覺溪
5th row: 甘谷
ValueCountFrequency (%)
2
 
0.7%
玉山 1
 
0.3%
元陵 1
 
0.3%
院洞 1
 
0.3%
熊川 1
 
0.3%
蔚山 1
 
0.3%
友保 1
 
0.3%
龍山 1
 
0.3%
龍宮 1
 
0.3%
元竹 1
 
0.3%
Other values (288) 288
96.3%

Most occurring characters

ValueCountFrequency (%)
22
 
3.2%
19
 
2.8%
16
 
2.3%
13
 
1.9%
13
 
1.9%
13
 
1.9%
10
 
1.5%
10
 
1.5%
10
 
1.5%
9
 
1.3%
Other values (304) 548
80.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 675 | 98.8%
Close Punctuation | 3 | 0.4%
Open Punctuation | 3 | 0.4%
Dash Punctuation | 2 | 0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
22
 
3.3%
19
 
2.8%
16
 
2.4%
13
 
1.9%
13
 
1.9%
13
 
1.9%
10
 
1.5%
10
 
1.5%
10
 
1.5%
9
 
1.3%
Other values (301) 540
80.0%
Close Punctuation
ValueCountFrequency (%)
) 3
100.0%
Open Punctuation
ValueCountFrequency (%)
( 3
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Han | 672 | 98.4%
Common | 8 | 1.2%
Hangul | 3 | 0.4%

Most frequent character per script

Han
ValueCountFrequency (%)
22
 
3.3%
19
 
2.8%
16
 
2.4%
13
 
1.9%
13
 
1.9%
13
 
1.9%
10
 
1.5%
10
 
1.5%
10
 
1.5%
9
 
1.3%
Other values (298) 537
79.9%
Common
ValueCountFrequency (%)
) 3
37.5%
( 3
37.5%
- 2
25.0%
Hangul
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring blocks

Value | Count | Frequency (%)
CJK | 643 | 94.1%
CJK Compat Ideographs | 29 | 4.2%
ASCII | 8 | 1.2%
Hangul | 3 | 0.4%
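The 29 characters in the CJK Compatibility Ideographs block are worth noting, because such code points look identical to ordinary Han characters but compare as different strings. The sketch below shows one way to detect them by code point range (U+F900 to U+FAFF) and to map them to their unified forms with NFC normalization. The helper name is hypothetical, and the example code point is illustrative rather than taken from the dataset.

```python
import unicodedata


def is_cjk_compat_ideograph(ch):
    """True if ch lies in the CJK Compatibility Ideographs block (U+F900..U+FAFF)."""
    return 0xF900 <= ord(ch) <= 0xFAFF


compat = "\uF900"  # illustrative compatibility ideograph, not from the dataset
unified = unicodedata.normalize("NFC", compat)

print(is_cjk_compat_ideograph(compat))   # True
print(is_cjk_compat_ideograph(unified))  # False
print(f"U+{ord(unified):04X}")           # U+8C48
```

Normalizing the 한자 column this way before joining or deduplicating is one option to consider, although whether it is appropriate depends on how the data is used downstream.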

Most frequent character per block

CJK
ValueCountFrequency (%)
22
 
3.4%
19
 
3.0%
16
 
2.5%
13
 
2.0%
13
 
2.0%
13
 
2.0%
10
 
1.6%
10
 
1.6%
10
 
1.6%
9
 
1.4%
Other values (276) 508
79.0%
ASCII
ValueCountFrequency (%)
) 3
37.5%
( 3
37.5%
- 2
25.0%
CJK Compat Ideographs
ValueCountFrequency (%)
3
 
10.3%
3
 
10.3%
2
 
6.9%
2
 
6.9%
2
 
6.9%
1
 
3.4%
1
 
3.4%
1
 
3.4%
1
 
3.4%
1
 
3.4%
Other values (12) 12
41.4%
Hangul
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

주소
Text

UNIQUE 

Distinct: 300
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 34
Median length: 26.5
Mean length: 18.576667
Min length: 12

Characters and Unicode

Total characters: 5573
Distinct characters: 251
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
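As a small illustration of these character properties, Python's `unicodedata.category` returns the general category of each code point. The sketch below counts categories for one 주소 value that appears in the Sample section of this report; the standard library exposes categories only, so scripts and blocks are not covered here.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count Unicode general categories (e.g. Lo, Nd, Zs) in a string."""
    return Counter(unicodedata.category(ch) for ch in text)


address = "대전 서구 벌곡로 1324(가수원동)"
counts = category_counts(address)
print(counts)
```

Here `Lo` is Other Letter (the Hangul syllables), `Zs` is Space Separator, `Nd` is Decimal Number, and `Ps` and `Pe` are Open and Close Punctuation.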

Unique

Unique: 300
Unique (%): 100.0%

Sample

1st row: 경기도 여주시 가남읍 태평리
2nd row: 대전 서구 벌곡로 1324(가수원동)
3rd row: 부산 부산진구 백양대로 91
4th row: 충북 영동군 심천면 각계길 55
5th row: 전북 정읍시 감곡면 호남철로 501
Value | Count | Frequency (%)
경북 | 55 | 3.9%
전남 | 36 | 2.5%
충남 | 27 | 1.9%
경남 | 26 | 1.8%
충북 | 25 | 1.8%
강원도 | 24 | 1.7%
전북 | 21 | 1.5%
경기도 | 20 | 1.4%
강원 | 14 | 1.0%
정선군 | 11 | 0.8%
Other values (824) | 1160 | 81.7%
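The counts in this table are per word, not per row: the 300 addresses contain 1,419 words in total, so 경북 with 55 occurrences accounts for about 3.9%. The sketch below assumes whitespace tokenization, which is consistent with these totals but is not confirmed by the report. It uses three 충북 addresses from the Sample section, and the helper name is illustrative.

```python
from collections import Counter


def word_counts(texts):
    """Count whitespace-separated words across all strings."""
    return Counter(word for text in texts for word in text.split())


addresses = [
    "충북 영동군 심천면 각계길 55",
    "충북 음성군 감곡면 왕장리 312-2",
    "충북 영동군 황간면 하옥포2길 14",
]
counts = word_counts(addresses)
total = sum(counts.values())

print(counts["충북"], counts["영동군"], total)
print(round(100 * 55 / 1419, 1))  # share of 경북 in the full column
```

The table also lists 강원도 and 강원 separately, so region names are not written consistently across rows; any grouping by region would need to merge such variants first.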

Most occurring characters

ValueCountFrequency (%)
1128
 
20.2%
1 203
 
3.6%
173
 
3.1%
169
 
3.0%
138
 
2.5%
128
 
2.3%
127
 
2.3%
122
 
2.2%
2 120
 
2.2%
119
 
2.1%
Other values (241) 3146
56.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3397 | 61.0%
Space Separator | 1128 | 20.2%
Decimal Number | 921 | 16.5%
Dash Punctuation | 74 | 1.3%
Close Punctuation | 25 | 0.4%
Open Punctuation | 25 | 0.4%
Other Punctuation | 3 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
173
 
5.1%
169
 
5.0%
138
 
4.1%
128
 
3.8%
127
 
3.7%
122
 
3.6%
119
 
3.5%
86
 
2.5%
84
 
2.5%
82
 
2.4%
Other values (226) 2169
63.9%
Decimal Number
ValueCountFrequency (%)
1 203
22.0%
2 120
13.0%
3 88
9.6%
5 86
9.3%
7 76
 
8.3%
6 74
 
8.0%
4 72
 
7.8%
8 69
 
7.5%
0 67
 
7.3%
9 66
 
7.2%
Space Separator
ValueCountFrequency (%)
1128
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 74
100.0%
Close Punctuation
ValueCountFrequency (%)
) 25
100.0%
Open Punctuation
ValueCountFrequency (%)
( 25
100.0%
Other Punctuation
ValueCountFrequency (%)
, 3
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3397 | 61.0%
Common | 2176 | 39.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
173
 
5.1%
169
 
5.0%
138
 
4.1%
128
 
3.8%
127
 
3.7%
122
 
3.6%
119
 
3.5%
86
 
2.5%
84
 
2.5%
82
 
2.4%
Other values (226) 2169
63.9%
Common
ValueCountFrequency (%)
1128
51.8%
1 203
 
9.3%
2 120
 
5.5%
3 88
 
4.0%
5 86
 
4.0%
7 76
 
3.5%
- 74
 
3.4%
6 74
 
3.4%
4 72
 
3.3%
8 69
 
3.2%
Other values (5) 186
 
8.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3397 | 61.0%
ASCII | 2176 | 39.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1128
51.8%
1 203
 
9.3%
2 120
 
5.5%
3 88
 
4.0%
5 86
 
4.0%
7 76
 
3.5%
- 74
 
3.4%
6 74
 
3.4%
4 72
 
3.3%
8 69
 
3.2%
Other values (5) 186
 
8.5%
Hangul
ValueCountFrequency (%)
173
 
5.1%
169
 
5.0%
138
 
4.1%
128
 
3.8%
127
 
3.7%
122
 
3.6%
119
 
3.5%
86
 
2.5%
84
 
2.5%
82
 
2.4%
Other values (226) 2169
63.9%

Interactions


Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

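First rows
(index) | 번호 | 역명 | 영문 | 한자 | 주소
0 | 1 | 가남 | Ganam | 加南 | 경기도 여주시 가남읍 태평리
1 | 2 | 가수원 | Gasuwon | 佳水院 | 대전 서구 벌곡로 1324(가수원동)
2 | 3 | 가야 | Gaya | 伽倻 | 부산 부산진구 백양대로 91
3 | 4 | 각계 | Gakgye | 覺溪 | 충북 영동군 심천면 각계길 55
4 | 5 | 감곡 | Gamgok | 甘谷 | 전북 정읍시 감곡면 호남철로 501
5 | 6 | 감곡장호원 | GangokJanghowon | 甘谷長湖院 | 충북 음성군 감곡면 왕장리 312-2
6 | 7 | 강경 | Ganggyeong | 江景 | 충남 논산시 강경읍 대흥로 1
7 | 8 | 강구 | Ganggu | 江口 | 경상북도 영덕군 강구면 강산로 67
8 | 9 | 강릉 | Gangneung | 江陵 | 강원도 강릉시 용지로 176
9 | 10 | 개운 | Gaeun | 開雲 | 전남 순천시 서면 개운길 30

Last rows
(index) | 번호 | 역명 | 영문 | 한자 | 주소
290 | 291 | 화명 | Hwamyeong | 華明 | 부산 북구 학사로 135(화명동)
291 | 292 | 화본 | Hwabon | 花本 | 경북 군위군 산성면 산성가음로 711-9
292 | 293 | 화산 | Hwasan | 花山 | 경북 영천시 화산면 장수로 917-10
293 | 294 | 화순 | Hwasun | 和順 | 전남 화순군 화순읍 벽라리 507
294 | 295 | 화양 | Hwayang | 華陽 | 충남 홍성군 금마면 화양리 181
295 | 296 | 황간 | Hwanggan | 黃澗 | 충북 영동군 황간면 하옥포2길 14
296 | 297 | 횡성 | Hoengseong | 橫城 | 강원도 횡성군 횡성읍 덕고로 591
297 | 298 | 횡천 | Hoengcheon | 橫川 | 경남 하동군 횡천면 중마길 277
298 | 299 | 효자 | Hyoja | 孝子 | 경북 포항시 남구 새천년대로 289
299 | 300 | 효천 | Hyocheon | 孝泉 | 광주시 남구 효천길 5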