Overview

Dataset statistics

Number of variables: 5
Number of observations: 147
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.9 KiB
Average record size in memory: 40.9 B

Variable types

Text: 3
Categorical: 2

Dataset

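Description: 키, 명칭, 행정시, 행정구, 행정동 (key, name, administrative city, administrative district, administrative neighborhood)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-13051/S/1/datasetView.do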

Alerts

행정시 has constant value "Seoul" (Constant)
키 has unique values (Unique)

Reproduction

Analysis started: 2023-12-11 08:41:10.819860
Analysis finished: 2023-12-11 08:41:11.325341
Duration: 0.51 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables


키
Text

UNIQUE 

Distinct: 147
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 1764
Distinct characters: 16
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
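
As a rough illustration of how character statistics like these can be derived, the sketch below classifies each character with Python's standard `unicodedata` module. It assumes the keys run sequentially from BE_IW17-0001 to BE_IW17-0147, which is inferred from the sample rows rather than stated by the report, and `category_counts` is a hypothetical helper name, not part of ydata-profiling.

```python
import unicodedata
from collections import Counter

# Assumption: keys are sequential, as the sample rows suggest.
keys = [f"BE_IW17-{i:04d}" for i in range(1, 148)]


def category_counts(values):
    """Count characters by Unicode general category code (e.g. 'Nd', 'Lu')."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


counts = category_counts(keys)
total = sum(counts.values())
for code, count in counts.most_common():
    # Nd = Decimal Number, Lu = Uppercase Letter,
    # Pc = Connector Punctuation, Pd = Dash Punctuation
    print(f"{code}: {count} ({count / total:.1%})")
```

Under that assumption the totals agree with the report: 882 decimal digits, 588 uppercase letters, and 147 each of connector and dash punctuation, out of 1764 characters.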

Unique

Unique: 147
Unique (%): 100.0%

Sample

1st row: BE_IW17-0001
2nd row: BE_IW17-0002
3rd row: BE_IW17-0003
4th row: BE_IW17-0004
5th row: BE_IW17-0005
Value | Count | Frequency (%)
be_iw17-0001 | 1 | 0.7%
be_iw17-0075 | 1 | 0.7%
be_iw17-0101 | 1 | 0.7%
be_iw17-0095 | 1 | 0.7%
be_iw17-0096 | 1 | 0.7%
be_iw17-0097 | 1 | 0.7%
be_iw17-0098 | 1 | 0.7%
be_iw17-0099 | 1 | 0.7%
be_iw17-0100 | 1 | 0.7%
be_iw17-0102 | 1 | 0.7%
Other values (137) | 137 | 93.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 279 | 15.8%
1 | 230 | 13.0%
7 | 172 | 9.8%
B | 147 | 8.3%
E | 147 | 8.3%
_ | 147 | 8.3%
I | 147 | 8.3%
W | 147 | 8.3%
- | 147 | 8.3%
3 | 35 | 2.0%
Other values (6) | 166 | 9.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 882 | 50.0%
Uppercase Letter | 588 | 33.3%
Connector Punctuation | 147 | 8.3%
Dash Punctuation | 147 | 8.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 279 | 31.6%
1 | 230 | 26.1%
7 | 172 | 19.5%
3 | 35 | 4.0%
2 | 35 | 4.0%
4 | 33 | 3.7%
5 | 25 | 2.8%
6 | 25 | 2.8%
8 | 24 | 2.7%
9 | 24 | 2.7%

Uppercase Letter
Value | Count | Frequency (%)
B | 147 | 25.0%
E | 147 | 25.0%
I | 147 | 25.0%
W | 147 | 25.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 147 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 147 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1176 | 66.7%
Latin | 588 | 33.3%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 279 | 23.7%
1 | 230 | 19.6%
7 | 172 | 14.6%
_ | 147 | 12.5%
- | 147 | 12.5%
3 | 35 | 3.0%
2 | 35 | 3.0%
4 | 33 | 2.8%
5 | 25 | 2.1%
6 | 25 | 2.1%
Other values (2) | 48 | 4.1%

Latin
Value | Count | Frequency (%)
B | 147 | 25.0%
E | 147 | 25.0%
I | 147 | 25.0%
W | 147 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1764 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 279 | 15.8%
1 | 230 | 13.0%
7 | 172 | 9.8%
B | 147 | 8.3%
E | 147 | 8.3%
_ | 147 | 8.3%
I | 147 | 8.3%
W | 147 | 8.3%
- | 147 | 8.3%
3 | 35 | 2.0%
Other values (6) | 166 | 9.4%

명칭
Text

Distinct: 55
Distinct (%): 37.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 33
Median length: 27
Mean length: 13.693878
Min length: 6

Characters and Unicode

Total characters: 2013
Distinct characters: 53
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 42
Unique (%): 28.6%
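
The report separates distinct values (how many different values occur) from unique values (values that occur exactly once). A minimal sketch of that distinction follows, using the 명칭 values from the last ten sample rows of this report as input; `distinct_and_unique` is a hypothetical helper, not part of ydata-profiling.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# 명칭 values of the last ten sample rows in this report.
names = (
    ["New Core Outlet"] * 2
    + ["2001 Outlet"] * 7
    + ["Lotte Outlet Seoul Station Shop"]
)
distinct, unique = distinct_and_unique(names)
print(distinct, unique)
```

Here three different names occur, but only one of them appears a single time, which is why a column can have far fewer unique values than distinct ones.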

Sample

1st row: OK Discount Mart
2nd row: New Mart
3rd row: Supermarket
4th row: Doyeong Discount Mart
5th row: Dongwon Discount Mart
Value | Count | Frequency (%)
e-mart | 31 | 9.5%
store | 31 | 9.5%
mart | 28 | 8.6%
department | 24 | 7.3%
lotte | 21 | 6.4%
plus | 19 | 5.8%
home | 19 | 5.8%
outlet | 12 | 3.7%
hanaro | 12 | 3.7%
hyundai | 8 | 2.4%
Other values (71) | 122 | 37.3%
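
The table above counts lowercased words rather than whole names. The report does not say how the words are tokenized; the sketch below assumes lowercasing followed by a whitespace split, which keeps hyphenated tokens such as "e-mart" intact. `word_counts` is a hypothetical helper, and the input is the five sample rows shown above.

```python
from collections import Counter


def word_counts(values):
    """Count lowercased, whitespace-separated words across all values."""
    return Counter(word for value in values for word in value.lower().split())


sample = [
    "OK Discount Mart",
    "New Mart",
    "Supermarket",
    "Doyeong Discount Mart",
    "Dongwon Discount Mart",
]
counts = word_counts(sample)
total = sum(counts.values())
for word, count in counts.most_common(3):
    # Frequency is the word's share of all words, not of all rows.
    print(f"{word} | {count} | {count / total:.1%}")
```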

Most occurring characters

Value | Count | Frequency (%)
t | 229 | 11.4%
e | 202 | 10.0%
(space) | 180 | 8.9%
a | 158 | 7.8%
r | 153 | 7.6%
o | 136 | 6.8%
n | 95 | 4.7%
m | 81 | 4.0%
u | 64 | 3.2%
S | 58 | 2.9%
Other values (43) | 657 | 32.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1402 | 69.6%
Uppercase Letter | 355 | 17.6%
Space Separator | 180 | 8.9%
Dash Punctuation | 33 | 1.6%
Decimal Number | 30 | 1.5%
Other Punctuation | 11 | 0.5%
Close Punctuation | 1 | < 0.1%
Open Punctuation | 1 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
t | 229 | 16.3%
e | 202 | 14.4%
a | 158 | 11.3%
r | 153 | 10.9%
o | 136 | 9.7%
n | 95 | 6.8%
m | 81 | 5.8%
u | 64 | 4.6%
l | 49 | 3.5%
p | 43 | 3.1%
Other values (14) | 192 | 13.7%

Uppercase Letter
Value | Count | Frequency (%)
S | 58 | 16.3%
H | 48 | 13.5%
D | 44 | 12.4%
M | 40 | 11.3%
E | 31 | 8.7%
L | 23 | 6.5%
O | 23 | 6.5%
P | 22 | 6.2%
C | 19 | 5.4%
N | 11 | 3.1%
Other values (10) | 36 | 10.1%

Decimal Number
Value | Count | Frequency (%)
0 | 14 | 46.7%
1 | 8 | 26.7%
2 | 7 | 23.3%
5 | 1 | 3.3%

Space Separator
Value | Count | Frequency (%)
(space) | 180 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 33 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 11 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1757 | 87.3%
Common | 256 | 12.7%

Most frequent character per script

Latin
Value | Count | Frequency (%)
t | 229 | 13.0%
e | 202 | 11.5%
a | 158 | 9.0%
r | 153 | 8.7%
o | 136 | 7.7%
n | 95 | 5.4%
m | 81 | 4.6%
u | 64 | 3.6%
S | 58 | 3.3%
l | 49 | 2.8%
Other values (34) | 532 | 30.3%

Common
Value | Count | Frequency (%)
(space) | 180 | 70.3%
- | 33 | 12.9%
0 | 14 | 5.5%
. | 11 | 4.3%
1 | 8 | 3.1%
2 | 7 | 2.7%
5 | 1 | 0.4%
) | 1 | 0.4%
( | 1 | 0.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2013 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
t | 229 | 11.4%
e | 202 | 10.0%
(space) | 180 | 8.9%
a | 158 | 7.8%
r | 153 | 7.6%
o | 136 | 6.8%
n | 95 | 4.7%
m | 81 | 4.0%
u | 64 | 3.2%
S | 58 | 2.9%
Other values (43) | 657 | 32.6%

행정시
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Value | Count
Seoul | 147

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Seoul
2nd row: Seoul
3rd row: Seoul
4th row: Seoul
5th row: Seoul

Common Values

Value | Count | Frequency (%)
Seoul | 147 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

행정구
Categorical

Distinct: 25
Distinct (%): 17.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Value | Count
Yeongdeungpo-gu | 15
Yangcheon-gu | 10
Nowon-gu | 9
Gangseo-gu | 8
Gangnam-gu | 8
Other values (20) | 97

Length

Max length: 15
Median length: 12
Mean length: 10.537415
Min length: 7

Unique

Unique: 1
Unique (%): 0.7%

Sample

1st row: Yangcheon-gu
2nd row: Dongjak-gu
3rd row: Yeongdeungpo-gu
4th row: Eunpyeong-gu
5th row: Jungnang-gu

Common Values

Value | Count | Frequency (%)
Yeongdeungpo-gu | 15 | 10.2%
Yangcheon-gu | 10 | 6.8%
Nowon-gu | 9 | 6.1%
Gangseo-gu | 8 | 5.4%
Gangnam-gu | 8 | 5.4%
Jungnang-gu | 8 | 5.4%
Gangdong-gu | 8 | 5.4%
Songpa-gu | 7 | 4.8%
Guro-gu | 7 | 4.8%
Dongdaemun-gu | 7 | 4.8%
Other values (15) | 60 | 40.8%

Length

[Figure: Histogram of lengths of the category]

행정동
Text

Distinct: 101
Distinct (%): 68.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 20
Median length: 17
Mean length: 12.673469
Min length: 9

Characters and Unicode

Total characters: 1863
Distinct characters: 42
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 71
Unique (%): 48.3%

Sample

1st row: Mok4-dong
2nd row: Sangdo2-dong
3rd row: Daerim3-dong
4th row: Yeokchon-dong
5th row: Myeonmokbon-dong
Value | Count | Frequency (%)
mok1-dong | 5 | 3.4%
junggye2.3-dong | 5 | 3.4%
yeongdeungpo-dong | 5 | 3.4%
gasan-dong | 4 | 2.7%
yangjae2-dong | 4 | 2.7%
cheonho2-dong | 3 | 2.0%
munjeong2-dong | 3 | 2.0%
banghwa2-dong | 3 | 2.0%
seogyo-dong | 2 | 1.4%
yangpyeong1-dong | 2 | 1.4%
Other values (91) | 111 | 75.5%

Most occurring characters

Value | Count | Frequency (%)
n | 305 | 16.4%
o | 267 | 14.3%
g | 263 | 14.1%
d | 163 | 8.7%
- | 147 | 7.9%
a | 102 | 5.5%
e | 90 | 4.8%
u | 42 | 2.3%
2 | 38 | 2.0%
1 | 35 | 1.9%
Other values (32) | 411 | 22.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1455 | 78.1%
Dash Punctuation | 147 | 7.9%
Uppercase Letter | 147 | 7.9%
Decimal Number | 106 | 5.7%
Other Punctuation | 8 | 0.4%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
n | 305 | 21.0%
o | 267 | 18.4%
g | 263 | 18.1%
d | 163 | 11.2%
a | 102 | 7.0%
e | 90 | 6.2%
u | 42 | 2.9%
h | 34 | 2.3%
i | 29 | 2.0%
y | 28 | 1.9%
Other values (10) | 132 | 9.1%

Uppercase Letter
Value | Count | Frequency (%)
S | 30 | 20.4%
M | 20 | 13.6%
G | 17 | 11.6%
Y | 17 | 11.6%
J | 16 | 10.9%
D | 10 | 6.8%
C | 9 | 6.1%
B | 9 | 6.1%
H | 8 | 5.4%
W | 4 | 2.7%
Other values (4) | 7 | 4.8%

Decimal Number
Value | Count | Frequency (%)
2 | 38 | 35.8%
1 | 35 | 33.0%
3 | 20 | 18.9%
5 | 6 | 5.7%
4 | 6 | 5.7%
6 | 1 | 0.9%

Dash Punctuation
Value | Count | Frequency (%)
- | 147 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 8 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1602 | 86.0%
Common | 261 | 14.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
n | 305 | 19.0%
o | 267 | 16.7%
g | 263 | 16.4%
d | 163 | 10.2%
a | 102 | 6.4%
e | 90 | 5.6%
u | 42 | 2.6%
h | 34 | 2.1%
S | 30 | 1.9%
i | 29 | 1.8%
Other values (24) | 277 | 17.3%

Common
Value | Count | Frequency (%)
- | 147 | 56.3%
2 | 38 | 14.6%
1 | 35 | 13.4%
3 | 20 | 7.7%
. | 8 | 3.1%
5 | 6 | 2.3%
4 | 6 | 2.3%
6 | 1 | 0.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1863 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
n | 305 | 16.4%
o | 267 | 14.3%
g | 263 | 14.1%
d | 163 | 8.7%
- | 147 | 7.9%
a | 102 | 5.5%
e | 90 | 4.8%
u | 42 | 2.3%
2 | 38 | 2.0%
1 | 35 | 1.9%
Other values (32) | 411 | 22.1%

Correlations

[Figure: correlation heatmap]

 | 명칭 | 행정구
명칭 | 1.000 | 0.679
행정구 | 0.679 | 1.000
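
The report does not state which association measure produced the 0.679 between 명칭 and 행정구. For pairs of categorical columns a common choice is Cramér's V, so the sketch below assumes that measure in its plain, uncorrected form; the actual report may use a different or bias-corrected variant. `cramers_v` is a hypothetical helper and the inputs are toy data, not this dataset.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association information
    return sqrt(chi2 / (n * k))


# Toy data: the first pair is perfectly associated, the second is independent.
perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```

The value ranges from 0 (no association) to 1 (one column fully determines the other). The guard for a constant column shows one reason the constant 행정시 would carry no information in such a matrix.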

Missing values

[Figure: nullity by column] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
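
The per-column nullity these figures show can be approximated with a simple count. The sketch below is only an illustration: it treats `None` and the empty string as missing (an assumption, since the report does not define missingness here), `missing_per_column` is a hypothetical helper, and the rows are the first three sample rows of this report with the key column labelled 키 by assumption.

```python
def missing_per_column(rows, columns):
    """Count cells per column that are None or an empty string."""
    return {
        col: sum(1 for row in rows if row.get(col) in (None, ""))
        for col in columns
    }


columns = ["키", "명칭", "행정시", "행정구", "행정동"]  # "키" is an assumed column name
rows = [
    dict(zip(columns, ["BE_IW17-0001", "OK Discount Mart", "Seoul", "Yangcheon-gu", "Mok4-dong"])),
    dict(zip(columns, ["BE_IW17-0002", "New Mart", "Seoul", "Dongjak-gu", "Sangdo2-dong"])),
    dict(zip(columns, ["BE_IW17-0003", "Supermarket", "Seoul", "Yeongdeungpo-gu", "Daerim3-dong"])),
]
missing = missing_per_column(rows, columns)
print(missing)
```

Every count is zero for these rows, in line with the report's figure of 0 missing cells.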

Sample

Index | 키 | 명칭 | 행정시 | 행정구 | 행정동
0 | BE_IW17-0001 | OK Discount Mart | Seoul | Yangcheon-gu | Mok4-dong
1 | BE_IW17-0002 | New Mart | Seoul | Dongjak-gu | Sangdo2-dong
2 | BE_IW17-0003 | Supermarket | Seoul | Yeongdeungpo-gu | Daerim3-dong
3 | BE_IW17-0004 | Doyeong Discount Mart | Seoul | Eunpyeong-gu | Yeokchon-dong
4 | BE_IW17-0005 | Dongwon Discount Mart | Seoul | Jungnang-gu | Myeonmokbon-dong
5 | BE_IW17-0006 | Mario Mario Jum | Seoul | Geumcheon-gu | Gasan-dong
6 | BE_IW17-0007 | Mega Mart | Seoul | Dongjak-gu | Sindaebang2-dong
7 | BE_IW17-0008 | Samcheonri Bicycle Mangujeom | Seoul | Jungnang-gu | Mangubon-dong
8 | BE_IW17-0009 | Sinhan SuperMarket | Seoul | Nowon-gu | Sanggye5-dong
9 | BE_IW17-0010 | LG Discount Mart | Seoul | Yangcheon-gu | Mok2-dong

Index | 키 | 명칭 | 행정시 | 행정구 | 행정동
137 | BE_IW17-0138 | New Core Outlet | Seoul | Seocho-gu | Banpo3-dong
138 | BE_IW17-0139 | New Core Outlet | Seoul | Seocho-gu | Banpo3-dong
139 | BE_IW17-0140 | 2001 Outlet | Seoul | Guro-gu | Gocheok1-dong
140 | BE_IW17-0141 | 2001 Outlet | Seoul | Yeongdeungpo-gu | Dangsan2-dong
141 | BE_IW17-0142 | 2001 Outlet | Seoul | Yeongdeungpo-gu | Yeongdeungpo-dong
142 | BE_IW17-0143 | 2001 Outlet | Seoul | Eunpyeong-gu | Daejo-dong
143 | BE_IW17-0144 | 2001 Outlet | Seoul | Nowon-gu | Junggye2.3-dong
144 | BE_IW17-0145 | 2001 Outlet | Seoul | Nowon-gu | Junggye2.3-dong
145 | BE_IW17-0146 | 2001 Outlet | Seoul | Nowon-gu | Junggye2.3-dong
146 | BE_IW17-0147 | Lotte Outlet Seoul Station Shop | Seoul | Jung-gu | Hoehyeon-dong