Overview

Dataset statistics

Number of variables: 5
Number of observations: 340
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 13.4 KiB
Average record size in memory: 40.4 B
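The average record size can be cross-checked against the total size and the row count. The following is a minimal sketch, assuming the report derives the average as total bytes divided by the number of observations; the function name is illustrative and not part of the report, and the 13.4 KiB input is itself a rounded figure.

```python
KIB = 1024  # bytes per kibibyte


def average_record_size(total_kib: float, n_rows: int) -> float:
    """Return the average number of bytes per row (illustrative helper)."""
    return total_kib * KIB / n_rows


avg = average_record_size(13.4, 340)
print(f"{avg:.1f} B")  # prints "40.4 B"
```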

Variable types

Text: 3
Categorical: 2

Dataset

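Description: 키 (key), 명칭 (name), 행정시 (city), 행정구 (district, gu), 행정동 (administrative neighborhood, dong)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-13058/S/1/datasetView.do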

Alerts

행정시 has constant value "Seoul" (Constant)
키 has unique values (Unique)

Reproduction

Analysis started: 2023-12-11 06:01:27.107722
Analysis finished: 2023-12-11 06:01:27.475008
Duration: 0.37 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables


키
Text

UNIQUE

Distinct: 340
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 4080
Distinct characters: 16
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
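As a rough illustration of the category counts reported for this variable, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. It is not the report's own implementation. The helper name is hypothetical and the input is limited to three keys taken from the Sample section. The standard library exposes general categories (`Lu`, `Nd`, `Pc`, `Pd`, ...) but not scripts or blocks, so only the category breakdown is covered.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Three keys from the report's sample rows.
sample_keys = ["BE_IW18-0238", "BE_IW18-0239", "BE_IW18-0240"]
counts = category_counts(sample_keys)
total = sum(counts.values())
for category, count in counts.most_common():
    print(category, count, f"{100 * count / total:.1f}%")
```

On these keys the proportions (half decimal digits, a third uppercase letters, the rest split between the connector and the dash) match the shares the report gives for the full column, which is expected because every key follows the same 12-character pattern.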

Unique

Unique: 340
Unique (%): 100.0%

Sample

1st row: BE_IW18-0238
2nd row: BE_IW18-0239
3rd row: BE_IW18-0240
4th row: BE_IW18-0241
5th row: BE_IW18-0242
Value               Count  Frequency (%)
be_iw18-0238        1      0.3%
be_iw18-0121        1      0.3%
be_iw18-0130        1      0.3%
be_iw18-0129        1      0.3%
be_iw18-0128        1      0.3%
be_iw18-0127        1      0.3%
be_iw18-0126        1      0.3%
be_iw18-0125        1      0.3%
be_iw18-0124        1      0.3%
be_iw18-0123        1      0.3%
Other values (330)  330    97.1%

Most occurring characters

Value             Count  Frequency (%)
1                 514    12.6%
0                 512    12.5%
8                 404    9.9%
B                 340    8.3%
E                 340    8.3%
_                 340    8.3%
I                 340    8.3%
W                 340    8.3%
-                 340    8.3%
2                 174    4.3%
Other values (6)  436    10.7%

Most occurring categories

Value                  Count  Frequency (%)
Decimal Number         2040   50.0%
Uppercase Letter       1360   33.3%
Connector Punctuation  340    8.3%
Dash Punctuation       340    8.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      514    25.2%
0      512    25.1%
8      404    19.8%
2      174    8.5%
3      115    5.6%
4      65     3.2%
5      64     3.1%
6      64     3.1%
7      64     3.1%
9      64     3.1%

Uppercase Letter
Value  Count  Frequency (%)
B      340    25.0%
E      340    25.0%
I      340    25.0%
W      340    25.0%

Connector Punctuation
Value  Count  Frequency (%)
_      340    100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      340    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  2720   66.7%
Latin   1360   33.3%

Most frequent character per script

Common
Value             Count  Frequency (%)
1                 514    18.9%
0                 512    18.8%
8                 404    14.9%
_                 340    12.5%
-                 340    12.5%
2                 174    6.4%
3                 115    4.2%
4                 65     2.4%
5                 64     2.4%
6                 64     2.4%
Other values (2)  128    4.7%

Latin
Value  Count  Frequency (%)
B      340    25.0%
E      340    25.0%
I      340    25.0%
W      340    25.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  4080   100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
1                 514    12.6%
0                 512    12.5%
8                 404    9.9%
B                 340    8.3%
E                 340    8.3%
_                 340    8.3%
I                 340    8.3%
W                 340    8.3%
-                 340    8.3%
2                 174    4.3%
Other values (6)  436    10.7%

명칭
Text

Distinct: 321
Distinct (%): 94.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

Length

Max length: 28
Median length: 20
Mean length: 11.047059
Min length: 1

Characters and Unicode

Total characters: 3756
Distinct characters: 59
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 305
Unique (%): 89.7%
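The gap between Distinct (321) and Unique (305) is easier to read with the two definitions side by side. The sketch below assumes that "distinct" counts the different values present and "unique" counts the values that occur exactly once. The helper name and the repeated entries in the input list are hypothetical and chosen only to show the difference; they are not taken from the dataset.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Hypothetical list: "Aroma" and "Amor" are repeated on purpose.
names = ["Aroma", "Amor", "Aroma", "Stella", "Amor", "Suisse"]
print(distinct_and_unique(names))  # prints (4, 2)
```

Under these definitions, 321 - 305 = 16 names in the column would be shared by two or more rows.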

Sample

1st row: Sinyeong Motel
2nd row: Siniljang
3rd row: Santafe Motel
4th row: Aroma
5th row: Amor
Value               Count  Frequency (%)
motel               144    24.2%
hotel               33     5.5%
yeoinsuk            10     1.7%
park                7      1.2%
theme               7      1.2%
inn                 7      1.2%
cinema              5      0.8%
house               5      0.8%
m                   4      0.7%
sky                 4      0.7%
Other values (333)  369    62.0%
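The word table above lists lowercased tokens. A minimal sketch of that kind of count follows, assuming the tokens come from lowercasing each name and splitting on whitespace; the report's actual tokenizer may treat punctuation differently. The helper name is hypothetical, and the input is the five names from the Sample section.

```python
from collections import Counter


def word_frequencies(values):
    """Count lowercased, whitespace-separated words across all values."""
    words = Counter()
    for value in values:
        words.update(value.lower().split())
    return words


# The five sample rows of the 명칭 column.
names = ["Sinyeong Motel", "Siniljang", "Santafe Motel", "Aroma", "Amor"]
freq = word_frequencies(names)
total = sum(freq.values())
for word, count in freq.most_common(3):
    print(word, count, f"{100 * count / total:.1f}%")
```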

Most occurring characters

Value              Count  Frequency (%)
e                  409    10.9%
o                  392    10.4%
n                  301    8.0%
a                  273    7.3%
(space)            272    7.2%
l                  267    7.1%
t                  220    5.9%
g                  173    4.6%
M                  172    4.6%
i                  139    3.7%
Other values (49)  1138   30.3%

Most occurring categories

Value              Count  Frequency (%)
Lowercase Letter   2863   76.2%
Uppercase Letter   605    16.1%
Space Separator    272    7.2%
Decimal Number     9      0.2%
Dash Punctuation   3      0.1%
Other Punctuation  1      < 0.1%
Modifier Symbol    1      < 0.1%
Open Punctuation   1      < 0.1%
Close Punctuation  1      < 0.1%

Most frequent character per category

Lowercase Letter
Value              Count  Frequency (%)
e                  409    14.3%
o                  392    13.7%
n                  301    10.5%
a                  273    9.5%
l                  267    9.3%
t                  220    7.7%
g                  173    6.0%
i                  139    4.9%
u                  96     3.4%
s                  82     2.9%
Other values (15)  511    17.8%

Uppercase Letter
Value              Count  Frequency (%)
M                  172    28.4%
H                  70     11.6%
S                  56     9.3%
C                  37     6.1%
G                  35     5.8%
P                  28     4.6%
T                  25     4.1%
R                  22     3.6%
Y                  20     3.3%
D                  18     3.0%
Other values (15)  122    20.2%

Decimal Number
Value  Count  Frequency (%)
2      5      55.6%
1      3      33.3%
5      1      11.1%

Space Separator
Value    Count  Frequency (%)
(space)  272    100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      3      100.0%

Other Punctuation
Value  Count  Frequency (%)
'      1      100.0%

Modifier Symbol
Value  Count  Frequency (%)
`      1      100.0%

Open Punctuation
Value  Count  Frequency (%)
(      1      100.0%

Close Punctuation
Value  Count  Frequency (%)
)      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   3468   92.3%
Common  288    7.7%

Most frequent character per script

Latin
Value              Count  Frequency (%)
e                  409    11.8%
o                  392    11.3%
n                  301    8.7%
a                  273    7.9%
l                  267    7.7%
t                  220    6.3%
g                  173    5.0%
M                  172    5.0%
i                  139    4.0%
u                  96     2.8%
Other values (40)  1026   29.6%

Common
Value    Count  Frequency (%)
(space)  272    94.4%
2        5      1.7%
-        3      1.0%
1        3      1.0%
'        1      0.3%
5        1      0.3%
`        1      0.3%
(        1      0.3%
)        1      0.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3756   100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
e                  409    10.9%
o                  392    10.4%
n                  301    8.0%
a                  273    7.3%
(space)            272    7.2%
l                  267    7.1%
t                  220    5.9%
g                  173    4.6%
M                  172    4.6%
i                  139    3.7%
Other values (49)  1138   30.3%

행정시
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

Seoul: 340

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Seoul
2nd row: Seoul
3rd row: Seoul
4th row: Seoul
5th row: Seoul

Common Values

Value  Count  Frequency (%)
Seoul  340    100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

Value  Count  Frequency (%)
seoul  340    100.0%

행정구
Categorical

Distinct: 25
Distinct (%): 7.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

Gwanak-gu: 32
Gangseo-gu: 28
Jongno-gu: 22
Songpa-gu: 21
Yeongdeungpo-gu: 20
Other values (20): 217

Length

Max length: 15
Median length: 12
Mean length: 10.255882
Min length: 7
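For readers who want to recompute length statistics like these, the sketch below uses only the Python standard library. It runs on the five district names from the Sample section, so its output describes those five rows and will not match the full-column figures above. The helper name is hypothetical, and the report may compute its median differently from `statistics.median`.

```python
import statistics


def length_stats(values):
    """Return max, median, mean and min of the string lengths in `values`."""
    lengths = [len(v) for v in values]
    return {
        "max": max(lengths),
        "median": statistics.median(lengths),
        "mean": statistics.mean(lengths),
        "min": min(lengths),
    }


# The five sample rows of the 행정구 column.
districts = ["Yeongdeungpo-gu", "Gangbuk-gu", "Gwangjin-gu", "Jongno-gu", "Dongdaemun-gu"]
stats = length_stats(districts)
print(stats)
```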

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Yeongdeungpo-gu
2nd row: Gangbuk-gu
3rd row: Gwangjin-gu
4th row: Jongno-gu
5th row: Dongdaemun-gu

Common Values

Value              Count  Frequency (%)
Gwanak-gu          32     9.4%
Gangseo-gu         28     8.2%
Jongno-gu          22     6.5%
Songpa-gu          21     6.2%
Yeongdeungpo-gu    20     5.9%
Gwangjin-gu        20     5.9%
Jungnang-gu        19     5.6%
Gangnam-gu         19     5.6%
Guro-gu            18     5.3%
Gangbuk-gu         17     5.0%
Other values (15)  124    36.5%

Length

[Figure: histogram of lengths of the category]
Value              Count  Frequency (%)
gwanak-gu          32     9.4%
gangseo-gu         28     8.2%
jongno-gu          22     6.5%
songpa-gu          21     6.2%
yeongdeungpo-gu    20     5.9%
gwangjin-gu        20     5.9%
jungnang-gu        19     5.6%
gangnam-gu         19     5.6%
guro-gu            18     5.3%
gangbuk-gu         17     5.0%
Other values (15)  124    36.5%

행정동
Text

Distinct: 133
Distinct (%): 39.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

Length

Max length: 20
Median length: 17
Mean length: 12.726471
Min length: 7

Characters and Unicode

Total characters: 4327
Distinct characters: 47
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 58
Unique (%): 17.1%

Sample

1st row: Yeongdeungpo-dong
2nd row: Songcheon-dong
3rd row: Hwayang-dong
4th row: Sajik-dong
5th row: Imun1-dong
Value                 Count  Frequency (%)
bangi2-dong           15     4.4%
hwagok1-dong          15     4.4%
sillim-dong           14     4.1%
sinchon-dong          14     4.1%
jongno1.2.3.4ga-dong  12     3.5%
yeongdeungpo-dong     9      2.6%
cheongnyong-dong      8      2.4%
yeoksam1-dong         8      2.4%
hwayang-dong          7      2.1%
sangbong2-dong        7      2.1%
Other values (123)    231    67.9%

Most occurring characters

Value              Count  Frequency (%)
n                  701    16.2%
o                  620    14.3%
g                  601    13.9%
d                  367    8.5%
-                  340    7.9%
a                  239    5.5%
e                  140    3.2%
i                  120    2.8%
S                  85     2.0%
u                  84     1.9%
Other values (37)  1030   23.8%

Most occurring categories

Value              Count  Frequency (%)
Lowercase Letter   3385   78.2%
Dash Punctuation   340    7.9%
Uppercase Letter   340    7.9%
Decimal Number     221    5.1%
Other Punctuation  41     0.9%

Most frequent character per category

Lowercase Letter
Value              Count  Frequency (%)
n                  701    20.7%
o                  620    18.3%
g                  601    17.8%
d                  367    10.8%
a                  239    7.1%
e                  140    4.1%
i                  120    3.5%
u                  84     2.5%
m                  68     2.0%
h                  65     1.9%
Other values (11)  380    11.2%

Uppercase Letter
Value             Count  Frequency (%)
S                 85     25.0%
H                 37     10.9%
J                 36     10.6%
G                 35     10.3%
Y                 30     8.8%
B                 28     8.2%
D                 25     7.4%
N                 16     4.7%
C                 14     4.1%
M                 13     3.8%
Other values (7)  21     6.2%

Decimal Number
Value  Count  Frequency (%)
1      78     35.3%
2      77     34.8%
3      25     11.3%
4      24     10.9%
6      8      3.6%
5      5      2.3%
7      4      1.8%

Dash Punctuation
Value  Count  Frequency (%)
-      340    100.0%

Other Punctuation
Value  Count  Frequency (%)
.      41     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   3725   86.1%
Common  602    13.9%

Most frequent character per script

Latin
Value              Count  Frequency (%)
n                  701    18.8%
o                  620    16.6%
g                  601    16.1%
d                  367    9.9%
a                  239    6.4%
e                  140    3.8%
i                  120    3.2%
S                  85     2.3%
u                  84     2.3%
m                  68     1.8%
Other values (28)  700    18.8%

Common
Value  Count  Frequency (%)
-      340    56.5%
1      78     13.0%
2      77     12.8%
.      41     6.8%
3      25     4.2%
4      24     4.0%
6      8      1.3%
5      5      0.8%
7      4      0.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  4327   100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
n                  701    16.2%
o                  620    14.3%
g                  601    13.9%
d                  367    8.5%
-                  340    7.9%
a                  239    5.5%
e                  140    3.2%
i                  120    2.8%
S                  85     2.0%
u                  84     1.9%
Other values (37)  1030   23.8%

Missing values

[Figure: a simple visualization of nullity by column]
[Figure: nullity matrix, a data-dense display which lets you quickly pick out patterns in data completion]
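The per-column counts behind a nullity chart can be approximated with a few lines of standard-library Python. This is a sketch under a stated assumption: a cell counts as missing when it is `None` or an empty string, which may differ from the NaN-based check the profiling tool applies. The helper name is hypothetical, and the two rows are copied from the report's sample.

```python
def missing_per_column(rows, columns):
    """Count cells that are None or empty, per column (assumed definition of missing)."""
    return {
        col: sum(1 for row in rows if row.get(col) in (None, ""))
        for col in columns
    }


columns = ["키", "명칭", "행정시", "행정구", "행정동"]
rows = [
    dict(zip(columns, ["BE_IW18-0238", "Sinyeong Motel", "Seoul", "Yeongdeungpo-gu", "Yeongdeungpo-dong"])),
    dict(zip(columns, ["BE_IW18-0239", "Siniljang", "Seoul", "Gangbuk-gu", "Songcheon-dong"])),
]
missing = missing_per_column(rows, columns)
print(sum(missing.values()))  # prints 0
```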

Sample

     키            명칭               행정시  행정구           행정동
0    BE_IW18-0238  Sinyeong Motel     Seoul   Yeongdeungpo-gu  Yeongdeungpo-dong
1    BE_IW18-0239  Siniljang          Seoul   Gangbuk-gu       Songcheon-dong
2    BE_IW18-0240  Santafe Motel      Seoul   Gwangjin-gu      Hwayang-dong
3    BE_IW18-0241  Aroma              Seoul   Jongno-gu        Sajik-dong
4    BE_IW18-0242  Amor               Seoul   Dongdaemun-gu    Imun1-dong
5    BE_IW18-0243  Amigos Hotel       Seoul   Gangseo-gu       Gonghang-dong
6    BE_IW18-0244  Absong Motel       Seoul   Gwanak-gu        Cheongnyong-dong
7    BE_IW18-0245  Atene              Seoul   Jungnang-gu      Mangubon-dong
8    BE_IW18-0246  Alpss Sanjang      Seoul   Gangseo-gu       Hwagok6-dong
9    BE_IW18-0247  Undukyue Hayanzib  Seoul   Jung-gu          Hoehyeon-dong

     키            명칭               행정시  행정구           행정동
330  BE_IW18-0228  Cherevel Hotel     Seoul   Songpa-gu        Bangi2-dong
331  BE_IW18-0229  Suisse             Seoul   Gwanak-gu        Sillim-dong
332  BE_IW18-0230  Stella             Seoul   Gangnam-gu       Yeoksam2-dong
333  BE_IW18-0231  Story two          Seoul   Guro-gu          Guro5-dong
334  BE_IW18-0232  Cinema Motel       Seoul   Gangseo-gu       Hwagok6-dong
335  BE_IW18-0233  Seattle Hotel      Seoul   Gangnam-gu       Yeoksam1-dong
336  BE_IW18-0234  CF Hotel           Seoul   Seodaemun-gu     Namgajwa1-dong
337  BE_IW18-0235  Sinna Motel        Seoul   Jungnang-gu      Junghwa2-dong
338  BE_IW18-0236  Sillajang Inn      Seoul   Yeongdeungpo-gu  Singil1-dong
339  BE_IW18-0237  Sinyeong           Seoul   Jongno-gu        Buam-dong