Overview

Dataset statistics

Number of variables: 5
Number of observations: 159
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.3 KiB
Average record size in memory: 40.8 B

Variable types

Text: 3
Categorical: 2

Dataset

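Description: 메인 키 (main key), 명칭 (name), 행정 시 (administrative city), 행정 구 (administrative district, gu), 행정 동 (administrative neighborhood, dong)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-13082/S/1/datasetView.do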

Alerts

행정 시 (administrative city) has constant value "Seoul" (Constant)
메인 키 (main key) has unique values (Unique)

Reproduction

Analysis started: 2024-04-14 03:46:52.526380
Analysis finished: 2024-04-14 03:46:55.275141
Duration: 2.75 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
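
A report of this kind can in principle be regenerated with ydata-profiling. The following is a minimal sketch, not the configuration actually used for this report: `build_report`, the CSV path, and the report title are hypothetical, and the settings stored in config.json are not reproduced here.

```python
def build_report(csv_path, out_path="report.html"):
    """Profile a CSV export of the dataset and write an HTML report (sketch)."""
    # Imported lazily so the function can be defined even where the
    # third-party packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)  # csv_path: hypothetical local export of the dataset
    report = ProfileReport(df, title="Dataset profile")  # default settings, an assumption
    report.to_file(out_path)
    return out_path
```

Calling `build_report("hotels.csv")` with a hypothetical local file would write `report.html`; timings and memory figures will differ from those listed above.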

Variables

메인 키 (main key)
Text

UNIQUE

Distinct: 159
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 KiB

Length

Max length: 14
Median length: 14
Mean length: 14
Min length: 14

Characters and Unicode

Total characters: 2226
Distinct characters: 18
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
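
As an illustration of these character properties, the sketch below counts Unicode general categories for one key from the Sample rows using Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and the two-letter codes (`Lu`, `Nd`, and so on) are the standard abbreviations for the category names used in this report.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count Unicode general categories (e.g. 'Lu', 'Nd') in a string."""
    return Counter(unicodedata.category(ch) for ch in text)

counts = category_counts("BE_LiST37-0001")
# Lu = Uppercase Letter, Ll = Lowercase Letter, Nd = Decimal Number,
# Pd = Dash Punctuation, Pc = Connector Punctuation
print(dict(counts))
```

Each key holds 5 uppercase letters and 6 digits, which is consistent with the 795 uppercase letters and 954 decimal numbers reported across 159 rows.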

Unique

Unique: 159
Unique (%): 100.0%

Sample

1st row: BE_LiST37-0001
2nd row: BE_LiST37-0002
3rd row: BE_LiST37-0003
4th row: BE_LiST37-0004
5th row: BE_LiST37-0005
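
The fixed length of 14 and the sample rows suggest a key format of `BE_LiST37-` followed by four digits. A minimal sketch for checking that format and uniqueness follows; the pattern is inferred from the sample rows only, and `check_keys` is a hypothetical helper.

```python
import re

# Pattern inferred from the sample rows; an assumption, not a documented format.
KEY_PATTERN = re.compile(r"BE_LiST37-\d{4}")

def check_keys(keys):
    """Return (all_match_pattern, all_unique) for a list of key strings."""
    all_match = all(KEY_PATTERN.fullmatch(k) is not None for k in keys)
    all_unique = len(set(keys)) == len(keys)
    return all_match, all_unique

sample = [f"BE_LiST37-{i:04d}" for i in range(1, 6)]  # the five sample rows
print(check_keys(sample))
```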
Value | Count | Frequency (%)
be_list37-0001 | 1 | 0.6%
be_list37-0101 | 1 | 0.6%
be_list37-0103 | 1 | 0.6%
be_list37-0104 | 1 | 0.6%
be_list37-0105 | 1 | 0.6%
be_list37-0106 | 1 | 0.6%
be_list37-0107 | 1 | 0.6%
be_list37-0108 | 1 | 0.6%
be_list37-0109 | 1 | 0.6%
be_list37-0102 | 1 | 0.6%
Other values (149) | 149 | 93.7%

Most occurring characters

Value | Count | Frequency (%)
0 | 292 | 13.1%
3 | 195 | 8.8%
7 | 185 | 8.3%
B | 159 | 7.1%
T | 159 | 7.1%
E | 159 | 7.1%
- | 159 | 7.1%
S | 159 | 7.1%
i | 159 | 7.1%
L | 159 | 7.1%
Other values (8) | 441 | 19.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 954 | 42.9%
Uppercase Letter | 795 | 35.7%
Dash Punctuation | 159 | 7.1%
Lowercase Letter | 159 | 7.1%
Connector Punctuation | 159 | 7.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 292 | 30.6%
3 | 195 | 20.4%
7 | 185 | 19.4%
1 | 96 | 10.1%
4 | 36 | 3.8%
5 | 36 | 3.8%
2 | 36 | 3.8%
6 | 26 | 2.7%
8 | 26 | 2.7%
9 | 26 | 2.7%

Uppercase Letter
Value | Count | Frequency (%)
B | 159 | 20.0%
T | 159 | 20.0%
E | 159 | 20.0%
S | 159 | 20.0%
L | 159 | 20.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 159 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
i | 159 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 159 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1272 | 57.1%
Latin | 954 | 42.9%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 292 | 23.0%
3 | 195 | 15.3%
7 | 185 | 14.5%
- | 159 | 12.5%
_ | 159 | 12.5%
1 | 96 | 7.5%
4 | 36 | 2.8%
5 | 36 | 2.8%
2 | 36 | 2.8%
6 | 26 | 2.0%
Other values (2) | 52 | 4.1%

Latin
Value | Count | Frequency (%)
B | 159 | 16.7%
T | 159 | 16.7%
E | 159 | 16.7%
S | 159 | 16.7%
i | 159 | 16.7%
L | 159 | 16.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2226 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 292 | 13.1%
3 | 195 | 8.8%
7 | 185 | 8.3%
B | 159 | 7.1%
T | 159 | 7.1%
E | 159 | 7.1%
- | 159 | 7.1%
S | 159 | 7.1%
i | 159 | 7.1%
L | 159 | 7.1%
Other values (8) | 441 | 19.8%

명칭 (name)
Text

Distinct: 152
Distinct (%): 95.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 KiB

Length

Max length: 35
Median length: 30
Mean length: 15.075472
Min length: 3

Characters and Unicode

Total characters: 2397
Distinct characters: 57
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2

Unique

Unique: 147
Unique (%): 92.5%
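
The Distinct and Unique counts differ here (152 versus 147). The sketch below assumes that "Distinct" counts the different values present and "Unique" counts values that occur exactly once, which is consistent with the constant 행정 시 column (1 distinct, 0 unique). The names in the example are hypothetical, not taken from the dataset.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Hypothetical names: one appears once, two are repeated.
names = ["Hotel A", "Hotel B", "Hotel B", "Hotel C", "Hotel C", "Hotel C"]
print(distinct_and_unique(names))  # (3, 1)
```

Under that reading, 5 of the 152 distinct names in this column occur more than once and together account for 12 of the 159 rows.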

Sample

1st row: JW Marriott Seoul
2nd row: MEASEOUL
3rd row: W Seoul Walkerhill
4th row: Grand Ambassador Seoul
5th row: Grand Intercontinental Seoul Parnas
Value | Count | Frequency (%)
hotel | 110 | 27.4%
seoul | 28 | 7.0%
hotels | 17 | 4.2%
tourist | 11 | 2.7%
the | 7 | 1.7%
new | 5 | 1.2%
lotte | 4 | 1.0%
grand | 4 | 1.0%
ambassador | 4 | 1.0%
western | 3 | 0.7%
Other values (178) | 209 | 52.0%
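
The table above appears to count lowercased words rather than whole names. A minimal sketch of such a word count, applied to the five sample rows, follows; splitting on whitespace after lowercasing is an assumption about how the report tokenizes text, and `word_counts` is a hypothetical helper.

```python
from collections import Counter

def word_counts(names):
    """Count lowercased, whitespace-separated words across all names."""
    counts = Counter()
    for name in names:
        counts.update(name.lower().split())
    return counts

sample_names = [
    "JW Marriott Seoul",
    "MEASEOUL",
    "W Seoul Walkerhill",
    "Grand Ambassador Seoul",
    "Grand Intercontinental Seoul Parnas",
]
counts = word_counts(sample_names)
print(counts.most_common(2))  # [('seoul', 4), ('grand', 2)]
```

Note that "MEASEOUL" is a single token and does not contribute to the count for "seoul".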

Most occurring characters

Value | Count | Frequency (%)
e | 272 | 11.3%
o | 266 | 11.1%
(space) | 243 | 10.1%
l | 213 | 8.9%
t | 203 | 8.5%
H | 137 | 5.7%
a | 130 | 5.4%
i | 100 | 4.2%
n | 100 | 4.2%
r | 83 | 3.5%
Other values (47) | 650 | 27.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1721 | 71.8%
Uppercase Letter | 422 | 17.6%
Space Separator | 243 | 10.1%
Other Punctuation | 5 | 0.2%
Decimal Number | 3 | 0.1%
Dash Punctuation | 2 | 0.1%
Other Letter | 1 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 272 | 15.8%
o | 266 | 15.5%
l | 213 | 12.4%
t | 203 | 11.8%
a | 130 | 7.6%
i | 100 | 5.8%
n | 100 | 5.8%
r | 83 | 4.8%
s | 70 | 4.1%
u | 60 | 3.5%
Other values (16) | 224 | 13.0%

Uppercase Letter
Value | Count | Frequency (%)
H | 137 | 32.5%
S | 49 | 11.6%
T | 32 | 7.6%
C | 21 | 5.0%
G | 20 | 4.7%
P | 17 | 4.0%
N | 15 | 3.6%
R | 15 | 3.6%
B | 14 | 3.3%
M | 14 | 3.3%
Other values (14) | 88 | 20.9%

Other Punctuation
Value | Count | Frequency (%)
& | 3 | 60.0%
. | 2 | 40.0%

Decimal Number
Value | Count | Frequency (%)
1 | 2 | 66.7%
2 | 1 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 243 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Other Letter
Value | Count | Frequency (%)
(Hangul character, not preserved in this copy) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2143 | 89.4%
Common | 253 | 10.6%
Hangul | 1 | < 0.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 272 | 12.7%
o | 266 | 12.4%
l | 213 | 9.9%
t | 203 | 9.5%
H | 137 | 6.4%
a | 130 | 6.1%
i | 100 | 4.7%
n | 100 | 4.7%
r | 83 | 3.9%
s | 70 | 3.3%
Other values (40) | 569 | 26.6%

Common
Value | Count | Frequency (%)
(space) | 243 | 96.0%
& | 3 | 1.2%
- | 2 | 0.8%
. | 2 | 0.8%
1 | 2 | 0.8%
2 | 1 | 0.4%

Hangul
Value | Count | Frequency (%)
(Hangul character, not preserved in this copy) | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2396 | > 99.9%
Hangul | 1 | < 0.1%

Most frequent character per block

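ASCII
Value | Count | Frequency (%)
e | 272 | 11.4%
o | 266 | 11.1%
(space) | 243 | 10.1%
l | 213 | 8.9%
t | 203 | 8.5%
H | 137 | 5.7%
a | 130 | 5.4%
i | 100 | 4.2%
n | 100 | 4.2%
r | 83 | 3.5%
Other values (46) | 649 | 27.1%

Hangul
Value | Count | Frequency (%)
(Hangul character, not preserved in this copy) | 1 | 100.0%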

행정 시 (administrative city)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 KiB
Seoul: 159

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Seoul
2nd row: Seoul
3rd row: Seoul
4th row: Seoul
5th row: Seoul

Common Values

Value | Count | Frequency (%)
Seoul | 159 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
seoul | 159 | 100.0%

행정 구 (administrative district, gu)
Categorical

Distinct: 23
Distinct (%): 14.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 KiB
Gangnam-gu: 39
Jung-gu: 27
Gangseo-gu: 11
Seocho-gu: 10
Songpa-gu: 10
Other values (18): 62

Length

Max length: 15
Median length: 13
Mean length: 9.5974843
Min length: 7

Unique

Unique: 5
Unique (%): 3.1%

Sample

1st row: Seocho-gu
2nd row: Yeongdeungpo-gu
3rd row: Gwangjin-gu
4th row: Jung-gu
5th row: Gangnam-gu

Common Values

Value | Count | Frequency (%)
Gangnam-gu | 39 | 24.5%
Jung-gu | 27 | 17.0%
Gangseo-gu | 11 | 6.9%
Seocho-gu | 10 | 6.3%
Songpa-gu | 10 | 6.3%
Yongsan-gu | 9 | 5.7%
Yeongdeungpo-gu | 8 | 5.0%
Mapo-gu | 8 | 5.0%
Jongno-gu | 7 | 4.4%
Gwangjin-gu | 4 | 2.5%
Other values (13) | 26 | 16.4%

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
gangnam-gu | 39 | 24.5%
jung-gu | 27 | 17.0%
gangseo-gu | 11 | 6.9%
seocho-gu | 10 | 6.3%
songpa-gu | 10 | 6.3%
yongsan-gu | 9 | 5.7%
yeongdeungpo-gu | 8 | 5.0%
mapo-gu | 8 | 5.0%
jongno-gu | 7 | 4.4%
gwangjin-gu | 4 | 2.5%
Other values (13) | 26 | 16.4%

행정 동 (administrative neighborhood, dong)
Text

Distinct: 78
Distinct (%): 49.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 KiB

Length

Max length: 20
Median length: 17
Mean length: 12.855346
Min length: 8

Characters and Unicode

Total characters: 2044
Distinct characters: 44
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 45
Unique (%): 28.3%

Sample

1st row: Banpo4-dong
2nd row: Yeoui-dong
3rd row: Gwangjang-dong
4th row: Jangchung-dong
5th row: Samseong1-dong
Value | Count | Frequency (%)
yeoksam1-dong | 12 | 7.5%
myeong-dong | 9 | 5.7%
jongno1.2.3.4ga-dong | 6 | 3.8%
yeoui-dong | 5 | 3.1%
nonhyeon1-dong | 5 | 3.1%
samseong2-dong | 5 | 3.1%
cheongdam-dong | 4 | 2.5%
sogong-dong | 4 | 2.5%
jangchung-dong | 4 | 2.5%
bangi2-dong | 4 | 2.5%
Other values (68) | 101 | 63.5%

Most occurring characters

Value | Count | Frequency (%)
n | 324 | 15.9%
o | 310 | 15.2%
g | 282 | 13.8%
d | 168 | 8.2%
- | 159 | 7.8%
a | 114 | 5.6%
e | 98 | 4.8%
m | 46 | 2.3%
h | 41 | 2.0%
1 | 35 | 1.7%
Other values (34) | 467 | 22.8%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1613 | 78.9%
Dash Punctuation | 159 | 7.8%
Uppercase Letter | 159 | 7.8%
Decimal Number | 93 | 4.5%
Other Punctuation | 20 | 1.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
n | 324 | 20.1%
o | 310 | 19.2%
g | 282 | 17.5%
d | 168 | 10.4%
a | 114 | 7.1%
e | 98 | 6.1%
m | 46 | 2.9%
h | 41 | 2.5%
s | 34 | 2.1%
u | 32 | 2.0%
Other values (11) | 164 | 10.2%

Uppercase Letter
Value | Count | Frequency (%)
S | 31 | 19.5%
Y | 27 | 17.0%
J | 18 | 11.3%
H | 14 | 8.8%
G | 12 | 7.5%
B | 11 | 6.9%
N | 10 | 6.3%
M | 10 | 6.3%
D | 9 | 5.7%
C | 7 | 4.4%
Other values (5) | 10 | 6.3%

Decimal Number
Value | Count | Frequency (%)
1 | 35 | 37.6%
2 | 28 | 30.1%
4 | 12 | 12.9%
3 | 11 | 11.8%
6 | 5 | 5.4%
7 | 2 | 2.2%

Dash Punctuation
Value | Count | Frequency (%)
- | 159 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 20 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1772 | 86.7%
Common | 272 | 13.3%

Most frequent character per script

Latin
Value | Count | Frequency (%)
n | 324 | 18.3%
o | 310 | 17.5%
g | 282 | 15.9%
d | 168 | 9.5%
a | 114 | 6.4%
e | 98 | 5.5%
m | 46 | 2.6%
h | 41 | 2.3%
s | 34 | 1.9%
u | 32 | 1.8%
Other values (26) | 323 | 18.2%

Common
Value | Count | Frequency (%)
- | 159 | 58.5%
1 | 35 | 12.9%
2 | 28 | 10.3%
. | 20 | 7.4%
4 | 12 | 4.4%
3 | 11 | 4.0%
6 | 5 | 1.8%
7 | 2 | 0.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2044 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
n | 324 | 15.9%
o | 310 | 15.2%
g | 282 | 13.8%
d | 168 | 8.2%
- | 159 | 7.8%
a | 114 | 5.6%
e | 98 | 4.8%
m | 46 | 2.3%
h | 41 | 2.0%
1 | 35 | 1.7%
Other values (34) | 467 | 22.8%

Correlations

[Figure: correlation heatmap]

 | 행정 구 | 행정 동
행정 구 | 1.000 | 1.000
행정 동 | 1.000 | 1.000
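
The report does not state which association measure produced these values. Assuming it is a Cramér's V-style statistic for categorical pairs, a value of 1.000 is what one would expect when every dong belongs to exactly one gu. The sketch below computes an uncorrected Cramér's V on eight (gu, dong) pairs taken from the Sample section; `cramers_v` is a hypothetical helper, and the profiling tool may apply corrections that this version omits.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    row = Counter(xs)            # marginal counts of the first variable
    col = Counter(ys)            # marginal counts of the second variable
    joint = Counter(zip(xs, ys))  # observed cell counts
    # chi2 = n * (sum over observed cells of O^2 / (R * C) - 1)
    chi2 = n * (sum(o * o / (row[x] * col[y]) for (x, y), o in joint.items()) - 1)
    k = min(len(row), len(col))
    if k < 2:
        return 0.0  # undefined for a constant variable; reported as 0 here
    return sqrt(chi2 / (n * (k - 1)))

gu = ["Seocho-gu", "Yeongdeungpo-gu", "Gwangjin-gu", "Jung-gu",
      "Gangnam-gu", "Yongsan-gu", "Gangnam-gu", "Jung-gu"]
dong = ["Banpo4-dong", "Yeoui-dong", "Gwangjang-dong", "Jangchung-dong",
        "Samseong1-dong", "Hannam-dong", "Yeoksam1-dong", "Myeong-dong"]
print(cramers_v(gu, dong))
```

Because each dong maps to a single gu, the statistic reaches its maximum, so a value this high reflects the administrative hierarchy rather than an independent finding.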

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

Row | 메인 키 | 명칭 | 행정 시 | 행정 구 | 행정 동
0 | BE_LiST37-0001 | JW Marriott Seoul | Seoul | Seocho-gu | Banpo4-dong
1 | BE_LiST37-0002 | MEASEOUL | Seoul | Yeongdeungpo-gu | Yeoui-dong
2 | BE_LiST37-0003 | W Seoul Walkerhill | Seoul | Gwangjin-gu | Gwangjang-dong
3 | BE_LiST37-0004 | Grand Ambassador Seoul | Seoul | Jung-gu | Jangchung-dong
4 | BE_LiST37-0005 | Grand Intercontinental Seoul Parnas | Seoul | Gangnam-gu | Samseong1-dong
5 | BE_LiST37-0006 | Grand Hyatt Seoul | Seoul | Yongsan-gu | Hannam-dong
6 | BE_LiST37-0007 | Grand Hilton Seoul | Seoul | Seodaemun-gu | Hongeun2-dong
7 | BE_LiST37-0008 | Novotal Ambassador Seoul Gangnam | Seoul | Gangnam-gu | Yeoksam1-dong
8 | BE_LiST37-0009 | Lotte City Hotel Gimpo Airport | Seoul | Gangseo-gu | Banghwa2-dong
9 | BE_LiST37-0010 | Lotte Hotel | Seoul | Jung-gu | Myeong-dong

Row | 메인 키 | 명칭 | 행정 시 | 행정 구 | 행정 동
149 | BE_LiST37-0150 | Hotel La Casa | Seoul | Gangnam-gu | Sinsa-dong
150 | BE_LiST37-0151 | Hotel Mind | Seoul | Geumcheon-gu | Gasan-dong
151 | BE_LiST37-0152 | Hotel L & S | Seoul | Jungnang-gu | Mangubon-dong
152 | BE_LiST37-0153 | Hotel KP | Seoul | Dongdaemun-gu | Hwigyeong1-dong
153 | BE_LiST37-0154 | Hotel Tiffany | Seoul | Gangnam-gu | Cheongdam-dong
154 | BE_LiST37-0155 | CF Hotel | Seoul | Songpa-gu | Bangi2-dong
155 | BE_LiST37-0156 | Gallery Hotel | Seoul | Eunpyeong-gu | Nokbeon-dong
156 | BE_LiST37-0157 | Seoul Partners House | Seoul | Yongsan-gu | Hannam-dong
157 | BE_LiST37-0158 | Saerim Hotel | Seoul | Jongno-gu | Jongno1.2.3.4ga-dong
158 | BE_LiST37-0159 | Jelly Hotel | Seoul | Gangnam-gu | Yeoksam1-dong