Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 7591
Missing cells (%): 9.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 703.1 KiB
Average record size in memory: 72.0 B
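
The derived figures above can be rechecked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables times observations) and that the average record size is the total memory divided by the number of observations; both assumptions reproduce the reported values.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 8
n_observations = 10_000
missing_cells = 7_591
total_size_kib = 703.1

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes, averaged over all observations.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
print(f"{avg_record_bytes:.1f} B per record")
```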

Variable types

Text: 6
Categorical: 2

Dataset

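Description: Status data on old geographic place names of Jeju Island, providing the old place name, its origin, location, current place name, standard-language form, and source.
Author: Jeju Special Self-Governing Province (제주특별자치도)
URL: https://www.data.go.kr/data/15111492/fileData.do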

Alerts

장소구분 is highly imbalanced (67.8%) [Imbalance]
현지리지명 has 7541 (75.4%) missing values [Missing]
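
The 67.8% imbalance figure can be reproduced from the value counts of 장소구분 listed later in this report. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies, normalised by the maximum entropy for that number of classes. The function name `imbalance_score` is illustrative and is not taken from the profiling library.

```python
import math

# Value counts of the 장소구분 column as listed in this report.
counts = [8867, 1132, 1]


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for perfectly balanced classes, near 1 for one dominant class."""
    k = len(counts)
    if k <= 1:
        return 0.0
    total = sum(counts)
    # Shannon entropy in bits of the observed class frequencies.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


score = imbalance_score(counts)
print(f"{score:.1%}")
```

Under this assumption the three counts give a score that rounds to the 67.8% shown in the alert.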

Reproduction

Analysis started: 2023-12-12 14:59:16.028849
Analysis finished: 2023-12-12 14:59:19.187752
Duration: 3.16 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

장소명
Text

Distinct: 8570
Distinct (%): 85.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 41
Median length: 31
Mean length: 4.3609
Min length: 1

Characters and Unicode

Total characters: 43609
Distinct characters: 1242
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
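
As a rough illustration of the category analysis described above, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and the two input strings are place names copied from the Sample section of this report. The standard library reports two-letter codes (`Lo` for Other Letter, `Po` for Other Punctuation); script and block lookups are not available in `unicodedata` and would need another source.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character of every string."""
    return Counter(unicodedata.category(ch) for text in values for ch in text)


# Two place names taken from the Sample section of this report.
sample = ["성당골목", "볼레낭동산&볼레낭알"]
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```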

Unique

Unique: 7898
Unique (%): 79.0%

Sample

1st row: 성당골목
2nd row: 볼레낭동산&볼레낭알
3rd row: 자금이왓
4th row: 안새미오름&명도암오름
5th row: 섯윤서
Value | Count | Frequency (%)
상동 | 36 | 0.4%
하동 | 33 | 0.3%
서동 | 29 | 0.3%
중동 | 27 | 0.3%
동동 | 24 | 0.2%
비석거리 | 21 | 0.2%
뒷동산 | 17 | 0.2%
포제동산 | 17 | 0.2%
솔대왓 | 16 | 0.2%
본동 | 16 | 0.2%
Other values (8546) | 9801 | 97.6%

Most occurring characters

ValueCountFrequency (%)
2279
 
5.2%
& 1578
 
3.6%
1366
 
3.1%
1145
 
2.6%
1077
 
2.5%
939
 
2.2%
822
 
1.9%
551
 
1.3%
490
 
1.1%
462
 
1.1%
Other values (1232) 32900
75.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 40864 | 93.7%
Other Punctuation | 1880 | 4.3%
Open Punctuation | 326 | 0.7%
Close Punctuation | 326 | 0.7%
Space Separator | 189 | 0.4%
Decimal Number | 18 | < 0.1%
Other Number | 5 | < 0.1%
Other Symbol | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2279
 
5.6%
1366
 
3.3%
1145
 
2.8%
1077
 
2.6%
939
 
2.3%
822
 
2.0%
551
 
1.3%
490
 
1.2%
462
 
1.1%
458
 
1.1%
Other values (1214) 31275
76.5%
Decimal Number
ValueCountFrequency (%)
1 8
44.4%
3 4
22.2%
0 2
 
11.1%
2 2
 
11.1%
5 1
 
5.6%
4 1
 
5.6%
Other Punctuation
ValueCountFrequency (%)
& 1578
83.9%
/ 301
 
16.0%
. 1
 
0.1%
Other Number
ValueCountFrequency (%)
2
40.0%
2
40.0%
1
20.0%
Open Punctuation
ValueCountFrequency (%)
( 318
97.5%
[ 8
 
2.5%
Close Punctuation
ValueCountFrequency (%)
) 318
97.5%
] 8
 
2.5%
Space Separator
ValueCountFrequency (%)
189
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 39985 | 91.7%
Common | 2745 | 6.3%
Han | 879 | 2.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2279
 
5.7%
1366
 
3.4%
1145
 
2.9%
1077
 
2.7%
939
 
2.3%
822
 
2.1%
551
 
1.4%
490
 
1.2%
462
 
1.2%
458
 
1.1%
Other values (870) 30396
76.0%
Han
ValueCountFrequency (%)
60
 
6.8%
37
 
4.2%
27
 
3.1%
19
 
2.2%
17
 
1.9%
17
 
1.9%
16
 
1.8%
14
 
1.6%
14
 
1.6%
12
 
1.4%
Other values (334) 646
73.5%
Common
ValueCountFrequency (%)
& 1578
57.5%
( 318
 
11.6%
) 318
 
11.6%
/ 301
 
11.0%
189
 
6.9%
[ 8
 
0.3%
] 8
 
0.3%
1 8
 
0.3%
3 4
 
0.1%
2
 
0.1%
Other values (8) 11
 
0.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 39985 | 91.7%
ASCII | 2739 | 6.3%
CJK | 859 | 2.0%
CJK Compat Ideographs | 20 | < 0.1%
Enclosed Alphanum | 5 | < 0.1%
Box Drawing | 1 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2279
 
5.7%
1366
 
3.4%
1145
 
2.9%
1077
 
2.7%
939
 
2.3%
822
 
2.1%
551
 
1.4%
490
 
1.2%
462
 
1.2%
458
 
1.1%
Other values (870) 30396
76.0%
ASCII
ValueCountFrequency (%)
& 1578
57.6%
( 318
 
11.6%
) 318
 
11.6%
/ 301
 
11.0%
189
 
6.9%
[ 8
 
0.3%
] 8
 
0.3%
1 8
 
0.3%
3 4
 
0.1%
0 2
 
0.1%
Other values (4) 5
 
0.2%
CJK
ValueCountFrequency (%)
60
 
7.0%
37
 
4.3%
27
 
3.1%
19
 
2.2%
17
 
2.0%
17
 
2.0%
16
 
1.9%
14
 
1.6%
14
 
1.6%
12
 
1.4%
Other values (319) 626
72.9%
CJK Compat Ideographs
ValueCountFrequency (%)
2
 
10.0%
2
 
10.0%
2
 
10.0%
2
 
10.0%
2
 
10.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
1
 
5.0%
Other values (5) 5
25.0%
Enclosed Alphanum
ValueCountFrequency (%)
2
40.0%
2
40.0%
1
20.0%
Box Drawing
ValueCountFrequency (%)
1
100.0%

장소구분
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
<NA>: 8867
자연부락명: 1132
자연부락명: 1

Length

Max length: 6
Median length: 4
Mean length: 4.1134
Min length: 4

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 8867 | 88.7%
자연부락명 | 1132 | 11.3%
자연부락명 | 1 | < 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
na | 8867 | 88.7%
자연부락명 | 1133 | 11.3%

의미
Text

Distinct: 9602
Distinct (%): 96.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1024
Median length: 549
Mean length: 60.2155
Min length: 3

Characters and Unicode

Total characters: 602155
Distinct characters: 2184
Distinct categories: 13
Distinct scripts: 4
Distinct blocks: 11
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9408
Unique (%): 94.1%

Sample

1st row: 성당이 있는 곳의 좁은 골목길이다.
2nd row: 볼레낭'은 '보리수나무'의 사투리이다. '볼레낭'이 군락을 이룬 높은 곳이라 해서 '볼레낭동산', 그 밑 바닷가라 해서 '볼레낭알'이다. 안비양 동남쪽 300미터 지점에 있다.
3rd row: 애월읍 수산리 300번지 일대의 밭 이름이다.
4th row: 봉개동 산2번지 일대의 오름으로 조리새미라는 샘물을 중심으로 마을과 가까운 오름을 안새미, 먼 곳을 밧새미라고 부른다. 두 오름을 합쳐 형제봉이라고도 부른다. 제주군읍지에는 형봉(兄峰)이라고 하였다. 밧새미(弟峰)보다 5m가량 높다. 이 오름 북사면에 명도암선생유허비가 있다. 일제강점기에 판 갱도진지가 있으며, 이 갱도진지에서는 4·3 당시 명도암 마을 주민들 일부와 용강동 주민도 피난생활을 하기도 했다.
5th row: 구좌읍 김녕리 일대의 지명이다.
Value | Count | Frequency (%)
이름이다 | 2248 | 1.6%
있는 | 1876 | 1.4%
제주 | 1842 | 1.3%
방언이다 | 1675 | 1.2%
있다 | 1504 | 1.1%
붙은 | 1417 | 1.0%
 | 1361 | 1.0%
일대의 | 1346 | 1.0%
데서 | 1185 | 0.9%
하여 | 1073 | 0.8%
Other values (38375) | 122412 | 88.7%

Most occurring characters

ValueCountFrequency (%)
128196
 
21.3%
27962
 
4.6%
19978
 
3.3%
. 16508
 
2.7%
' 14638
 
2.4%
11509
 
1.9%
9377
 
1.6%
8968
 
1.5%
7166
 
1.2%
7044
 
1.2%
Other values (2174) 350809
58.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 413902 | 68.7%
Space Separator | 128196 | 21.3%
Other Punctuation | 35835 | 6.0%
Decimal Number | 14989 | 2.5%
Open Punctuation | 2633 | 0.4%
Close Punctuation | 2630 | 0.4%
Initial Punctuation | 1201 | 0.2%
Final Punctuation | 1147 | 0.2%
Dash Punctuation | 783 | 0.1%
Lowercase Letter | 543 | 0.1%
Other values (3) | 296 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
27962
 
6.8%
19978
 
4.8%
11509
 
2.8%
9377
 
2.3%
8968
 
2.2%
7166
 
1.7%
7044
 
1.7%
6691
 
1.6%
6641
 
1.6%
6531
 
1.6%
Other values (2095) 302035
73.0%
Other Punctuation
ValueCountFrequency (%)
. 16508
46.1%
' 14638
40.8%
, 3931
 
11.0%
" 276
 
0.8%
/ 235
 
0.7%
· 191
 
0.5%
: 25
 
0.1%
& 21
 
0.1%
! 6
 
< 0.1%
; 2
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
> 56
24.9%
< 41
18.2%
~ 38
16.9%
26
11.6%
+ 24
10.7%
19
 
8.4%
= 10
 
4.4%
7
 
3.1%
2
 
0.9%
1
 
0.4%
Decimal Number
ValueCountFrequency (%)
1 3192
21.3%
2 1878
12.5%
0 1773
11.8%
3 1517
10.1%
4 1264
 
8.4%
5 1219
 
8.1%
9 1152
 
7.7%
6 1022
 
6.8%
8 1013
 
6.8%
7 959
 
6.4%
Lowercase Letter
ValueCountFrequency (%)
m 433
79.7%
k 90
 
16.6%
c 8
 
1.5%
a 3
 
0.6%
h 2
 
0.4%
t 2
 
0.4%
s 2
 
0.4%
r 1
 
0.2%
1
 
0.2%
e 1
 
0.2%
Uppercase Letter
ValueCountFrequency (%)
K 7
30.4%
C 4
17.4%
M 4
17.4%
B 2
 
8.7%
D 1
 
4.3%
Y 1
 
4.3%
S 1
 
4.3%
N 1
 
4.3%
H 1
 
4.3%
V 1
 
4.3%
Other Symbol
ValueCountFrequency (%)
18
37.5%
15
31.2%
5
 
10.4%
3
 
6.2%
2
 
4.2%
° 2
 
4.2%
1
 
2.1%
1
 
2.1%
1
 
2.1%
Open Punctuation
ValueCountFrequency (%)
( 2449
93.0%
[ 98
 
3.7%
48
 
1.8%
35
 
1.3%
2
 
0.1%
1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 2446
93.0%
] 97
 
3.7%
48
 
1.8%
35
 
1.3%
2
 
0.1%
2
 
0.1%
Initial Punctuation
ValueCountFrequency (%)
1189
99.0%
12
 
1.0%
Final Punctuation
ValueCountFrequency (%)
1135
99.0%
12
 
1.0%
Space Separator
ValueCountFrequency (%)
128196
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 783
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 410315 | 68.1%
Common | 187687 | 31.2%
Han | 3587 | 0.6%
Latin | 566 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
27962
 
6.8%
19978
 
4.9%
11509
 
2.8%
9377
 
2.3%
8968
 
2.2%
7166
 
1.7%
7044
 
1.7%
6691
 
1.6%
6641
 
1.6%
6531
 
1.6%
Other values (1240) 298448
72.7%
Han
ValueCountFrequency (%)
135
 
3.8%
77
 
2.1%
69
 
1.9%
68
 
1.9%
63
 
1.8%
56
 
1.6%
51
 
1.4%
45
 
1.3%
38
 
1.1%
33
 
0.9%
Other values (845) 2952
82.3%
Common
ValueCountFrequency (%)
128196
68.3%
. 16508
 
8.8%
' 14638
 
7.8%
, 3931
 
2.1%
1 3192
 
1.7%
( 2449
 
1.3%
) 2446
 
1.3%
2 1878
 
1.0%
0 1773
 
0.9%
3 1517
 
0.8%
Other values (49) 11159
 
5.9%
Latin
ValueCountFrequency (%)
m 433
76.5%
k 90
 
15.9%
c 8
 
1.4%
K 7
 
1.2%
C 4
 
0.7%
M 4
 
0.7%
a 3
 
0.5%
B 2
 
0.4%
h 2
 
0.4%
t 2
 
0.4%
Other values (10) 11
 
1.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 410303 | 68.1%
ASCII | 185435 | 30.8%
CJK | 3471 | 0.6%
Punctuation | 2350 | 0.4%
None | 376 | 0.1%
CJK Compat Ideographs | 116 | < 0.1%
CJK Compat | 40 | < 0.1%
Math Operators | 26 | < 0.1%
Arrows | 21 | < 0.1%
Compat Jamo | 12 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
128196
69.1%
. 16508
 
8.9%
' 14638
 
7.9%
, 3931
 
2.1%
1 3192
 
1.7%
( 2449
 
1.3%
) 2446
 
1.3%
2 1878
 
1.0%
0 1773
 
1.0%
3 1517
 
0.8%
Other values (40) 8907
 
4.8%
Hangul
ValueCountFrequency (%)
27962
 
6.8%
19978
 
4.9%
11509
 
2.8%
9377
 
2.3%
8968
 
2.2%
7166
 
1.7%
7044
 
1.7%
6691
 
1.6%
6641
 
1.6%
6531
 
1.6%
Other values (1234) 298436
72.7%
Punctuation
ValueCountFrequency (%)
1189
50.6%
1135
48.3%
12
 
0.5%
12
 
0.5%
2
 
0.1%
None
ValueCountFrequency (%)
· 191
50.8%
48
 
12.8%
48
 
12.8%
35
 
9.3%
35
 
9.3%
7
 
1.9%
2
 
0.5%
2
 
0.5%
2
 
0.5%
° 2
 
0.5%
Other values (4) 4
 
1.1%
CJK
ValueCountFrequency (%)
135
 
3.9%
77
 
2.2%
69
 
2.0%
68
 
2.0%
63
 
1.8%
56
 
1.6%
51
 
1.5%
45
 
1.3%
38
 
1.1%
33
 
1.0%
Other values (812) 2836
81.7%
Math Operators
ValueCountFrequency (%)
26
100.0%
CJK Compat Ideographs
ValueCountFrequency (%)
25
21.6%
19
16.4%
13
11.2%
5
 
4.3%
4
 
3.4%
4
 
3.4%
4
 
3.4%
4
 
3.4%
3
 
2.6%
3
 
2.6%
Other values (23) 32
27.6%
Arrows
ValueCountFrequency (%)
19
90.5%
2
 
9.5%
CJK Compat
ValueCountFrequency (%)
18
45.0%
15
37.5%
3
 
7.5%
2
 
5.0%
1
 
2.5%
1
 
2.5%
Geometric Shapes
ValueCountFrequency (%)
5
100.0%
Compat Jamo
ValueCountFrequency (%)
4
33.3%
3
25.0%
2
16.7%
1
 
8.3%
1
 
8.3%
1
 
8.3%

소재지
Text

Distinct: 182
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 3
Mean length: 3.1054
Min length: 2

Characters and Unicode

Total characters: 31054
Distinct characters: 144
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 서홍동
2nd row: 조일리
3rd row: 수산리
4th row: 봉개동
5th row: 김녕리
Value | Count | Frequency (%)
수산리 | 425 | 4.2%
유수암리 | 303 | 3.0%
사계리 | 268 | 2.7%
세화리 | 246 | 2.5%
봉개동 | 207 | 2.1%
평대리 | 197 | 2.0%
고성리 | 185 | 1.8%
김녕리 | 181 | 1.8%
한동리 | 178 | 1.8%
회천동 | 169 | 1.7%
Other values (172) | 7641 | 76.4%

Most occurring characters

ValueCountFrequency (%)
7421
23.9%
3063
 
9.9%
1017
 
3.3%
787
 
2.5%
670
 
2.2%
630
 
2.0%
577
 
1.9%
486
 
1.6%
464
 
1.5%
429
 
1.4%
Other values (134) 15510
49.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 30395
97.9%
Decimal Number 590
 
1.9%
Space Separator 69
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
7421
24.4%
3063
 
10.1%
1017
 
3.3%
787
 
2.6%
670
 
2.2%
630
 
2.1%
577
 
1.9%
486
 
1.6%
464
 
1.5%
429
 
1.4%
Other values (130) 14851
48.9%
Decimal Number
ValueCountFrequency (%)
1 375
63.6%
2 205
34.7%
3 10
 
1.7%
Space Separator
ValueCountFrequency (%)
69
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 30395
97.9%
Common 659
 
2.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
7421
24.4%
3063
 
10.1%
1017
 
3.3%
787
 
2.6%
670
 
2.2%
630
 
2.1%
577
 
1.9%
486
 
1.6%
464
 
1.5%
429
 
1.4%
Other values (130) 14851
48.9%
Common
ValueCountFrequency (%)
1 375
56.9%
2 205
31.1%
69
 
10.5%
3 10
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 30395
97.9%
ASCII 659
 
2.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
7421
24.4%
3063
 
10.1%
1017
 
3.3%
787
 
2.6%
670
 
2.2%
630
 
2.1%
577
 
1.9%
486
 
1.6%
464
 
1.5%
429
 
1.4%
Other values (130) 14851
48.9%
ASCII
ValueCountFrequency (%)
1 375
56.9%
2 205
31.1%
69
 
10.5%
3 10
 
1.5%

현지리지명
Text

MISSING 

Distinct: 1868
Distinct (%): 76.0%
Missing: 7541
Missing (%): 75.4%
Memory size: 156.2 KiB

Length

Max length: 20
Median length: 18
Mean length: 3.3594957
Min length: 1

Characters and Unicode

Total characters: 8261
Distinct characters: 560
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
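
The mean length reported for 현지리지명 appears to be the total character count divided by the number of non-missing values, not by all 10000 observations. This is an inference from the figures in this report, not a documented rule; the short check below reproduces the value under that assumption.

```python
# Figures copied from the 현지리지명 summary in this report.
n_observations = 10_000
n_missing = 7_541
total_characters = 8_261

# Assumption: only rows with a value contribute to the mean.
n_present = n_observations - n_missing
mean_length = total_characters / n_present

print(n_present, round(mean_length, 7))
```

Dividing by all 10000 rows would give about 0.83 instead, which does not match the reported 3.3594957.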

Unique

Unique: 1633
Unique (%): 66.4%

Sample

1st row: 안새미오름
2nd row: 궤네기굴
3rd row: 알뱅듸
4th row: 후앙모살
5th row: 악생이
Value | Count | Frequency (%)
상동 | 36 | 1.5%
서동 | 33 | 1.3%
동동 | 30 | 1.2%
하동 | 30 | 1.2%
중동 | 29 | 1.2%
본동 | 15 | 0.6%
포제동산 | 13 | 0.5%
뒷동산 | 13 | 0.5%
비석거리 | 13 | 0.5%
망동산 | 12 | 0.5%
Other values (1855) | 2255 | 91.0%

Most occurring characters

ValueCountFrequency (%)
938
 
11.4%
295
 
3.6%
191
 
2.3%
175
 
2.1%
175
 
2.1%
147
 
1.8%
110
 
1.3%
107
 
1.3%
105
 
1.3%
103
 
1.2%
Other values (550) 5915
71.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 8123
98.3%
Other Punctuation 53
 
0.6%
Space Separator 44
 
0.5%
Decimal Number 15
 
0.2%
Close Punctuation 13
 
0.2%
Open Punctuation 13
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
938
 
11.5%
295
 
3.6%
191
 
2.4%
175
 
2.2%
175
 
2.2%
147
 
1.8%
110
 
1.4%
107
 
1.3%
105
 
1.3%
103
 
1.3%
Other values (542) 5777
71.1%
Decimal Number
ValueCountFrequency (%)
1 8
53.3%
2 4
26.7%
0 2
 
13.3%
3 1
 
6.7%
Other Punctuation
ValueCountFrequency (%)
& 53
100.0%
Space Separator
ValueCountFrequency (%)
44
100.0%
Close Punctuation
ValueCountFrequency (%)
) 13
100.0%
Open Punctuation
ValueCountFrequency (%)
( 13
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 8088
97.9%
Common 138
 
1.7%
Han 35
 
0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
938
 
11.6%
295
 
3.6%
191
 
2.4%
175
 
2.2%
175
 
2.2%
147
 
1.8%
110
 
1.4%
107
 
1.3%
105
 
1.3%
103
 
1.3%
Other values (515) 5742
71.0%
Han
ValueCountFrequency (%)
6
 
17.1%
3
 
8.6%
2
 
5.7%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
Other values (17) 17
48.6%
Common
ValueCountFrequency (%)
& 53
38.4%
44
31.9%
) 13
 
9.4%
( 13
 
9.4%
1 8
 
5.8%
2 4
 
2.9%
0 2
 
1.4%
3 1
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 8088
97.9%
ASCII 138
 
1.7%
CJK 35
 
0.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
938
 
11.6%
295
 
3.6%
191
 
2.4%
175
 
2.2%
175
 
2.2%
147
 
1.8%
110
 
1.4%
107
 
1.3%
105
 
1.3%
103
 
1.3%
Other values (515) 5742
71.0%
ASCII
ValueCountFrequency (%)
& 53
38.4%
44
31.9%
) 13
 
9.4%
( 13
 
9.4%
1 8
 
5.8%
2 4
 
2.9%
0 2
 
1.4%
3 1
 
0.7%
CJK
ValueCountFrequency (%)
6
 
17.1%
3
 
8.6%
2
 
5.7%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
1
 
2.9%
Other values (17) 17
48.6%

표준어
Text

Distinct: 8272
Distinct (%): 83.1%
Missing: 47
Missing (%): 0.5%
Memory size: 156.2 KiB

Length

Max length: 41
Median length: 28
Mean length: 4.1165478
Min length: 1

Characters and Unicode

Total characters: 40972
Distinct characters: 1017
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7518
Unique (%): 75.5%

Sample

1st row: 성당골목
2nd row: 보리수나무동산&보리수나무아래
3rd row: 자금이밭
4th row: 안샘오름&명도암오르
5th row: 섯윤서
Value | Count | Frequency (%)
 | 55 | 0.5%
상동 | 36 | 0.3%
하동 | 34 | 0.3%
서동 | 32 | 0.3%
중동 | 31 | 0.3%
동동 | 26 | 0.2%
길목 | 25 | 0.2%
 | 25 | 0.2%
동네 | 23 | 0.2%
포제동산 | 22 | 0.2%
Other values (8298) | 10238 | 97.1%

Most occurring characters

ValueCountFrequency (%)
2431
 
5.9%
1790
 
4.4%
1366
 
3.3%
1071
 
2.6%
916
 
2.2%
805
 
2.0%
& 803
 
2.0%
690
 
1.7%
587
 
1.4%
577
 
1.4%
Other values (1007) 29936
73.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 39329
96.0%
Other Punctuation 807
 
2.0%
Space Separator 690
 
1.7%
Close Punctuation 65
 
0.2%
Open Punctuation 65
 
0.2%
Decimal Number 16
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2431
 
6.2%
1790
 
4.6%
1366
 
3.5%
1071
 
2.7%
916
 
2.3%
805
 
2.0%
587
 
1.5%
577
 
1.5%
507
 
1.3%
492
 
1.3%
Other values (992) 28787
73.2%
Decimal Number
ValueCountFrequency (%)
1 7
43.8%
3 4
25.0%
0 2
 
12.5%
2 1
 
6.2%
5 1
 
6.2%
4 1
 
6.2%
Other Punctuation
ValueCountFrequency (%)
& 803
99.5%
/ 2
 
0.2%
, 1
 
0.1%
. 1
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 64
98.5%
] 1
 
1.5%
Open Punctuation
ValueCountFrequency (%)
( 64
98.5%
[ 1
 
1.5%
Space Separator
ValueCountFrequency (%)
690
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 39175
95.6%
Common 1643
 
4.0%
Han 154
 
0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2431
 
6.2%
1790
 
4.6%
1366
 
3.5%
1071
 
2.7%
916
 
2.3%
805
 
2.1%
587
 
1.5%
577
 
1.5%
507
 
1.3%
492
 
1.3%
Other values (889) 28633
73.1%
Han
ValueCountFrequency (%)
7
 
4.5%
7
 
4.5%
6
 
3.9%
5
 
3.2%
4
 
2.6%
3
 
1.9%
3
 
1.9%
3
 
1.9%
3
 
1.9%
3
 
1.9%
Other values (93) 110
71.4%
Common
ValueCountFrequency (%)
& 803
48.9%
690
42.0%
) 64
 
3.9%
( 64
 
3.9%
1 7
 
0.4%
3 4
 
0.2%
/ 2
 
0.1%
0 2
 
0.1%
2 1
 
0.1%
5 1
 
0.1%
Other values (5) 5
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 39175
95.6%
ASCII 1643
 
4.0%
CJK 147
 
0.4%
CJK Compat Ideographs 7
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2431
 
6.2%
1790
 
4.6%
1366
 
3.5%
1071
 
2.7%
916
 
2.3%
805
 
2.1%
587
 
1.5%
577
 
1.5%
507
 
1.3%
492
 
1.3%
Other values (889) 28633
73.1%
ASCII
ValueCountFrequency (%)
& 803
48.9%
690
42.0%
) 64
 
3.9%
( 64
 
3.9%
1 7
 
0.4%
3 4
 
0.2%
/ 2
 
0.1%
0 2
 
0.1%
2 1
 
0.1%
5 1
 
0.1%
Other values (5) 5
 
0.3%
CJK
ValueCountFrequency (%)
7
 
4.8%
7
 
4.8%
6
 
4.1%
5
 
3.4%
4
 
2.7%
3
 
2.0%
3
 
2.0%
3
 
2.0%
3
 
2.0%
3
 
2.0%
Other values (87) 103
70.1%
CJK Compat Ideographs
ValueCountFrequency (%)
2
28.6%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%

출처 구분
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
문헌조사: 4776
온라인조사: 3744
현지조사: 1480

Length

Max length: 5
Median length: 4
Mean length: 4.3744
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 문헌조사
2nd row: 온라인조사
3rd row: 문헌조사
4th row: 문헌조사
5th row: 문헌조사

Common Values

Value | Count | Frequency (%)
문헌조사 | 4776 | 47.8%
온라인조사 | 3744 | 37.4%
현지조사 | 1480 | 14.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
문헌조사 | 4776 | 47.8%
온라인조사 | 3744 | 37.4%
현지조사 | 1480 | 14.8%

출처
Text

Distinct: 1324
Distinct (%): 13.2%
Missing: 3
Missing (%): < 0.1%
Memory size: 156.2 KiB

Length

Max length: 125
Median length: 121
Mean length: 26.695909
Min length: 6

Characters and Unicode

Total characters: 266879
Distinct characters: 211
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 946
Unique (%): 9.5%

Sample

1st row: 서홍동 향토지
2nd row: https://www.jeju.go.kr/vill/joil/intro/history.htm
3rd row: 수산리 향토지
4th row: 봉개동 향토지
5th row: 김녕리 향토지
Value | Count | Frequency (%)
향토지 | 4762 | 29.3%
경로당 | 869 | 5.3%
유수암리 | 303 | 1.9%
수산리 | 295 | 1.8%
사계리 | 262 | 1.6%
마을회관 | 237 | 1.5%
봉개동 | 203 | 1.2%
사무소 | 188 | 1.2%
김녕리 | 179 | 1.1%
평대리 | 176 | 1.1%
Other values (1277) | 8774 | 54.0%

Most occurring characters

ValueCountFrequency (%)
/ 22550
 
8.4%
t 18022
 
6.8%
. 14615
 
5.5%
i 13369
 
5.0%
o 12908
 
4.8%
w 12504
 
4.7%
h 11144
 
4.2%
r 9833
 
3.7%
e 9639
 
3.6%
s 8831
 
3.3%
Other values (201) 133464
50.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 161630
60.6%
Other Punctuation 42838
 
16.1%
Other Letter 38823
 
14.5%
Decimal Number 13746
 
5.2%
Space Separator 6292
 
2.4%
Math Symbol 3377
 
1.3%
Uppercase Letter 62
 
< 0.1%
Close Punctuation 33
 
< 0.1%
Open Punctuation 33
 
< 0.1%
Other Symbol 23
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4926
 
12.7%
4860
 
12.5%
4762
 
12.3%
4019
 
10.4%
2342
 
6.0%
874
 
2.3%
869
 
2.2%
869
 
2.2%
863
 
2.2%
679
 
1.7%
Other values (139) 13760
35.4%
Lowercase Letter
ValueCountFrequency (%)
t 18022
 
11.2%
i 13369
 
8.3%
o 12908
 
8.0%
w 12504
 
7.7%
h 11144
 
6.9%
r 9833
 
6.1%
e 9639
 
6.0%
s 8831
 
5.5%
l 7873
 
4.9%
n 7835
 
4.8%
Other values (15) 49672
30.7%
Uppercase Letter
ValueCountFrequency (%)
I 25
40.3%
C 8
 
12.9%
T 5
 
8.1%
G 5
 
8.1%
D 4
 
6.5%
N 4
 
6.5%
S 3
 
4.8%
H 2
 
3.2%
X 1
 
1.6%
L 1
 
1.6%
Other values (4) 4
 
6.5%
Decimal Number
ValueCountFrequency (%)
1 4990
36.3%
2 2316
16.8%
5 1284
 
9.3%
8 1063
 
7.7%
0 955
 
6.9%
6 725
 
5.3%
3 721
 
5.2%
4 690
 
5.0%
7 596
 
4.3%
9 406
 
3.0%
Other Punctuation
ValueCountFrequency (%)
/ 22550
52.6%
. 14615
34.1%
: 3750
 
8.8%
? 1487
 
3.5%
& 430
 
1.0%
, 6
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
= 3375
99.9%
+ 2
 
0.1%
Space Separator
ValueCountFrequency (%)
6292
100.0%
Close Punctuation
ValueCountFrequency (%)
) 33
100.0%
Open Punctuation
ValueCountFrequency (%)
( 33
100.0%
Other Symbol
ValueCountFrequency (%)
23
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 22
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 161692
60.6%
Common 66364
24.9%
Hangul 38823
 
14.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4926
 
12.7%
4860
 
12.5%
4762
 
12.3%
4019
 
10.4%
2342
 
6.0%
874
 
2.3%
869
 
2.2%
869
 
2.2%
863
 
2.2%
679
 
1.7%
Other values (139) 13760
35.4%
Latin
ValueCountFrequency (%)
t 18022
 
11.1%
i 13369
 
8.3%
o 12908
 
8.0%
w 12504
 
7.7%
h 11144
 
6.9%
r 9833
 
6.1%
e 9639
 
6.0%
s 8831
 
5.5%
l 7873
 
4.9%
n 7835
 
4.8%
Other values (29) 49734
30.8%
Common
ValueCountFrequency (%)
/ 22550
34.0%
. 14615
22.0%
6292
 
9.5%
1 4990
 
7.5%
: 3750
 
5.7%
= 3375
 
5.1%
2 2316
 
3.5%
? 1487
 
2.2%
5 1284
 
1.9%
8 1063
 
1.6%
Other values (13) 4642
 
7.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 228033
85.4%
Hangul 38822
 
14.5%
CJK Compat 23
 
< 0.1%
Compat Jamo 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 22550
 
9.9%
t 18022
 
7.9%
. 14615
 
6.4%
i 13369
 
5.9%
o 12908
 
5.7%
w 12504
 
5.5%
h 11144
 
4.9%
r 9833
 
4.3%
e 9639
 
4.2%
s 8831
 
3.9%
Other values (51) 94618
41.5%
Hangul
ValueCountFrequency (%)
4926
 
12.7%
4860
 
12.5%
4762
 
12.3%
4019
 
10.4%
2342
 
6.0%
874
 
2.3%
869
 
2.2%
869
 
2.2%
863
 
2.2%
679
 
1.7%
Other values (138) 13759
35.4%
CJK Compat
ValueCountFrequency (%)
23
100.0%
Compat Jamo
ValueCountFrequency (%)
1
100.0%

Correlations

(variable) | 장소구분 | 출처 구분
장소구분 | 1.000 | 0.000
출처 구분 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
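
One common way to compute a nullity correlation is to turn each column into a 0/1 missing indicator and take the Pearson correlation of the indicators. The sketch below assumes that definition; the helper names and the toy columns are hypothetical and are not taken from this dataset or from the profiling library.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (nan if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A column with no missing values (or only missing values) has no nullity variance.
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def nullity_correlation(col_a, col_b):
    """Correlate the missing-value indicators (1 = missing) of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    return pearson(a, b)


# Hypothetical toy columns: a and b are missing together, c is missing when a is present.
a = ["x", None, "y", None]
b = ["p", None, "q", None]
c = [None, "r", None, "s"]

print(nullity_correlation(a, b))
print(nullity_correlation(a, c))
```

A value near 1 means two columns tend to be missing in the same rows, a value near -1 means one is present when the other is absent, and a column that is never missing has no defined nullity correlation.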

Sample

(index) | 장소명 | 장소구분 | 의미 | 소재지 | 현지리지명 | 표준어 | 출처 구분 | 출처
9369 | 성당골목 | <NA> | 성당이 있는 곳의 좁은 골목길이다. | 서홍동 | <NA> | 성당골목 | 문헌조사 | 서홍동 향토지
10280 | 볼레낭동산&볼레낭알 | <NA> | 볼레낭'은 '보리수나무'의 사투리이다. '볼레낭'이 군락을 이룬 높은 곳이라 해서 '볼레낭동산', 그 밑 바닷가라 해서 '볼레낭알'이다. 안비양 동남쪽 300미터 지점에 있다. | 조일리 | <NA> | 보리수나무동산&보리수나무아래 | 온라인조사 | https://www.jeju.go.kr/vill/joil/intro/history.htm
5232 | 자금이왓 | <NA> | 애월읍 수산리 300번지 일대의 밭 이름이다. | 수산리 | <NA> | 자금이밭 | 문헌조사 | 수산리 향토지
8049 | 안새미오름&명도암오름 | <NA> | 봉개동 산2번지 일대의 오름으로 조리새미라는 샘물을 중심으로 마을과 가까운 오름을 안새미, 먼 곳을 밧새미라고 부른다. 두 오름을 합쳐 형제봉이라고도 부른다. 제주군읍지에는 형봉(兄峰)이라고 하였다. 밧새미(弟峰)보다 5m가량 높다. 이 오름 북사면에 명도암선생유허비가 있다. 일제강점기에 판 갱도진지가 있으며, 이 갱도진지에서는 4·3 당시 명도암 마을 주민들 일부와 용강동 주민도 피난생활을 하기도 했다. | 봉개동 | 안새미오름 | 안샘오름&명도암오르 | 문헌조사 | 봉개동 향토지
979 | 섯윤서 | <NA> | 구좌읍 김녕리 일대의 지명이다. | 김녕리 | <NA> | 섯윤서 | 문헌조사 | 김녕리 향토지
4319 | 기러기왓&기레기왓 | <NA> | 이 지역의 지형 지세가 마치 기러기가 앉아있는 형국이라 하여 붙은 이름이다. | 세화리 | <NA> | 기러기밭 | 온라인조사 | https://www.jeju.go.kr/vill/sehwa3/index.htm
1805 | 복카이&복카이내 | <NA> | 이 지역은 ‘복카이네’라는 내(川)가 있고 일대에는 협소한 경작지가 있어 이를 ‘복카이’라고 부른다. | 신흥리 | <NA> | 복카이 | 온라인조사 | https://www.jeju.go.kr/vill/sinheung1/intro/history.htm?act=view/seq=1128677
5974 | 도로쇄동산 | <NA> | 애월읍 유수암리 산43번지 일대이다. | 유수암리 | <NA> | 도로쇄동산 | 문헌조사 | 유수암리 향토지
4678 | 알마장 | <NA> | 속칭 '알검댕일 터' 일대를 말하며 일주도로 변에 접한 지형은 매봉과 연결된다. 초지여서 우마의 방목지로 많이 이용되었으나 현재는 소나무밭이 되어 있다. | 표선리 | <NA> | 알마장 | 온라인조사 | http://www.jeju.go.kr/vill/pyoseon/intro/history.htm
2903 | 거스른물 | <NA> | 안덕면 사계리 209번지 일대의 물 이름이다. | 사계리 | <NA> | 거스른물 | 문헌조사 | 사계리 향토지
(index) | 장소명 | 장소구분 | 의미 | 소재지 | 현지리지명 | 표준어 | 출처 구분 | 출처
2463 | 쇠목동산 | <NA> | 언덕의 형세가 소의 목(쇠목)처럼 생겼다 하여 붙은 이름이다. ‘푸렁머채’ 북쪽에 위치한다. | 삼달리 | <NA> | 쇠목동산 | 온라인조사 | https://www.jeju.go.kr/vill/simdal1/intro/history.htm
6505 | 고상이빌래 | <NA> | 현 동남원 위쪽 서남쪽으로 고독한 과부가 살았다는 유래가 있다. | 옹포리 | <NA> | 고상이반석 | 온라인조사 | http://www.jeju.go.kr/vill/ongpo/intro/history.htm?act=view&seq=1125578
3583 | 알엉밭&암전(岩田) | <NA> | 인조조 시대(仁朝朝 時代)에 이더리(이교동)에 정착하였던 그 당시에는 이곳이 돌이 많이 쌓였기 때문에 제주 사투리로 엉덕이라 했다. 그 주위에는 밭이 있었으므로 돌무덤 아래 밭이 있다는 뜻으로 붙은 이름이다. | 상모리 | <NA> | 아래바위밭&암전 | 온라인조사 | https://www.jeju.go.kr/vill/sangmo1/intro/history.htm?act=view/seq=1127808
7485 | 머구남밧 | <NA> | 머구남'은 '머귀나무'의 제주 방언이다. | 영평동 | <NA> | 머귀나무밭 | 문헌조사 | 영평동 향토지
8819 | 칠성골 | 자연부락명 | 칠성골은 산지목골에서 관덕정 광장까지의 길로 조선시대 중심 도로였다. 그래서 일제시대 상가는 읍성 중심부인 칠성통(본정통·本町通)과 관덕로(원정통·元町通), 남문 한짓골, 서문한질 일대에 자리 잡았다. 옛 제주인들은 북두칠성(北斗七星)을 숭배했다. 고·양·부 삼신인(三神人)이 각각 일도, 이도, 삼도로 나누어 차지한 후, 북두칠성 모양을 본떠 대(臺)를 쌓아 마을을 이뤘다. 칠성대는 제주 성안 7곳에 북두칠성 모양으로 흩어져 있었다. 7곳의 칠성대 중 3곳이 있었던 데서 칠성골이라 부르게 됐다. | 이도1동 | <NA> | 칠성골 | 문헌조사 | 이도1동 향토지
9205 | 잔목동산 | <NA> | 옛날 이 지역이 목장이었을 때 목장 안으로 들어가는 진입로였다는 뜻으로 '잔목동산'이라 부른다. | 동홍동 | <NA> | 잔목동산 | 온라인조사 | https://www.jeju.go.kr/vill/donghong/intro/history.htm
6799 | 늬커리 | <NA> | 협재리 1730번지 앞에 위치한 오거리인데도 '늬커리(사지동, 四枝洞)', 즉 '사거리'라고 부르고 있다. | 협재리 | <NA> | 네거리 | 온라인조사 | https://www.jeju.go.kr/vill/hyeopjae/intro/history.htm?act=view&seq=1125584
4326 | 덕우내 | <NA> | 홍서물 서쪽에 있으며, 식수로 이용하였다. | 세화리 | <NA> | 덕우내 | 현지조사 | 세화3리 경로당
9965 | 논지물&논짓물 | <NA> | 해변 가까이 있는 논에서 나는 물이라 하여 '논짓물'이라 부르나, 바다와 너무 가까이에서 물이 솟아나 바로 바다로 흘러가 버리기 때문에 식수나 농업용수로 사용할 수가 없고 그냥 버린다고 하여 쓸데없는 물이라는 의미로 '논짓물'이라 한다. | 하예동 | 논지물 | 논에서 나는 물 | 온라인조사 | https://www.jeju.go.kr/vill/haye1/intro/history.htm
7681 | 개자치 | <NA> | 지형이 마치 개처럼 생겼다 해서 명명된 지명이다. '치'는 언덕이나 장소를 뜻한다. | 아라동 | <NA> | 개언덕 | 문헌조사 | 아라동 향토지