Overview

Dataset statistics

Number of variables: 5
Number of observations: 286
Missing cells: 294
Missing cells (%): 20.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.3 KiB
Average record size in memory: 40.5 B
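
The derived figures in this table follow from the raw counts. The sketch below is an illustration rather than the profiler's own code. It recomputes the missing-cell percentage and the average record size from the numbers reported here, assuming the percentage is taken over all cells (variables × observations). Because the total size is itself rounded to 11.3 KiB, the record size is only approximate.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 5
n_observations = 286
missing_cells = 294
total_size_kib = 11.3  # rounded value as printed in the report

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```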

Variable types

Text: 4
Categorical: 1

Dataset

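Description: Key (키), Name (명칭), Administrative city (행정 시), Administrative district (행정 구), Administrative neighborhood (행정 동)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-13021/S/1/datasetView.do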

Alerts

행정 구 has 147 (51.4%) missing values [Missing]
행정 동 has 147 (51.4%) missing values [Missing]
The key column (its name is blank in this report) has unique values [Unique]

Reproduction

Analysis started: 2023-12-11 10:13:45.220326
Analysis finished: 2023-12-11 10:13:45.764708
Duration: 0.54 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables


Text

UNIQUE 

Distinct: 286
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 3432
Distinct characters: 16
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
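
As a rough illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module to count general categories over the five sample values listed for this variable. It is not the profiler's implementation. The short codes it prints map to the category names used in this report: `Nd` is Decimal Number, `Lu` is Uppercase Letter, `Pc` is Connector Punctuation, and `Pd` is Dash Punctuation.

```python
import unicodedata
from collections import Counter

# The five sample key values shown for this variable.
sample_keys = [
    "BE_IW04-0232",
    "BE_IW04-0233",
    "BE_IW04-0234",
    "BE_IW04-0235",
    "BE_IW04-0236",
]


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


counts = category_counts(sample_keys)
total = sum(counts.values())
for code, n in counts.most_common():
    print(f"{code}: {n} ({100 * n / total:.1f}%)")
```

Every key follows the same 12-character pattern, so the shares on this small sample (half digits, a third uppercase letters, the rest split between `_` and `-`) match the proportions reported for the full column.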

Unique

Unique: 286
Unique (%): 100.0%

Sample

1st row: BE_IW04-0232
2nd row: BE_IW04-0233
3rd row: BE_IW04-0234
4th row: BE_IW04-0235
5th row: BE_IW04-0236

Value | Count | Frequency (%)
be_iw04-0232 | 1 | 0.3%
be_iw04-0133 | 1 | 0.3%
be_iw04-0139 | 1 | 0.3%
be_iw04-0138 | 1 | 0.3%
be_iw04-0137 | 1 | 0.3%
be_iw04-0136 | 1 | 0.3%
be_iw04-0135 | 1 | 0.3%
be_iw04-0143 | 1 | 0.3%
be_iw04-0132 | 1 | 0.3%
be_iw04-0141 | 1 | 0.3%
Other values (276) | 276 | 96.5%

Most occurring characters

Value | Count | Frequency (%)
0 | 728 | 21.2%
4 | 345 | 10.1%
B | 286 | 8.3%
E | 286 | 8.3%
_ | 286 | 8.3%
I | 286 | 8.3%
W | 286 | 8.3%
- | 286 | 8.3%
1 | 159 | 4.6%
2 | 146 | 4.3%
Other values (6) | 338 | 9.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1716 | 50.0%
Uppercase Letter | 1144 | 33.3%
Connector Punctuation | 286 | 8.3%
Dash Punctuation | 286 | 8.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 728 | 42.4%
4 | 345 | 20.1%
1 | 159 | 9.3%
2 | 146 | 8.5%
3 | 59 | 3.4%
5 | 59 | 3.4%
6 | 59 | 3.4%
7 | 58 | 3.4%
8 | 55 | 3.2%
9 | 48 | 2.8%

Uppercase Letter
Value | Count | Frequency (%)
B | 286 | 25.0%
E | 286 | 25.0%
I | 286 | 25.0%
W | 286 | 25.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 286 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 286 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2288 | 66.7%
Latin | 1144 | 33.3%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 728 | 31.8%
4 | 345 | 15.1%
_ | 286 | 12.5%
- | 286 | 12.5%
1 | 159 | 6.9%
2 | 146 | 6.4%
3 | 59 | 2.6%
5 | 59 | 2.6%
6 | 59 | 2.6%
7 | 58 | 2.5%
Other values (2) | 103 | 4.5%

Latin
Value | Count | Frequency (%)
B | 286 | 25.0%
E | 286 | 25.0%
I | 286 | 25.0%
W | 286 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3432 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 728 | 21.2%
4 | 345 | 10.1%
B | 286 | 8.3%
E | 286 | 8.3%
_ | 286 | 8.3%
I | 286 | 8.3%
W | 286 | 8.3%
- | 286 | 8.3%
1 | 159 | 4.6%
2 | 146 | 4.3%
Other values (6) | 338 | 9.8%

명칭
Text

Distinct: 207
Distinct (%): 72.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 46
Median length: 36
Mean length: 25.545455
Min length: 7

Characters and Unicode

Total characters: 7306
Distinct characters: 65
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 168
Unique (%): 58.7%
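
The report gives separate Distinct and Unique counts for this variable (207 and 168). The sketch below shows one reading of the difference, assuming Distinct counts the different values present and Unique counts the values that occur exactly once. That reading fits the figures in this report but is not stated in it. The input is just the five sample names shown for this variable, not the full column.

```python
from collections import Counter

# The five sample values of 명칭 shown in this report.
names = [
    "Sword in the Moon Filming site",
    "Sword in the Moon Filming site",
    "Cheonghaejin Port",
    "Chowon Photo Studio",
    "Filming site",
]

freq = Counter(names)

# Assumed definitions (see the paragraph before this block).
distinct = len(freq)                                  # different values present
unique = sum(1 for n in freq.values() if n == 1)      # values seen exactly once

print(f"Distinct: {distinct}, Unique: {unique}")
```

On this sample the repeated name counts once toward Distinct and not at all toward Unique, which is why Unique can never exceed Distinct.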

Sample

1st row: Sword in the Moon Filming site
2nd row: Sword in the Moon Filming site
3rd row: Cheonghaejin Port
4th row: Chowon Photo Studio
5th row: Filming site

Value | Count | Frequency (%)
site | 118 | 9.9%
filming | 117 | 9.9%
set | 105 | 8.9%
for | 103 | 8.7%
flim | 65 | 5.5%
films | 33 | 2.8%
the | 29 | 2.4%
of | 22 | 1.9%
studio | 11 | 0.9%
park | 11 | 0.9%
Other values (302) | 572 | 48.2%

Most occurring characters

Value | Count | Frequency (%)
(space) | 906 | 12.4%
i | 674 | 9.2%
e | 662 | 9.1%
n | 455 | 6.2%
o | 446 | 6.1%
m | 364 | 5.0%
s | 360 | 4.9%
l | 355 | 4.9%
t | 339 | 4.6%
a | 335 | 4.6%
Other values (55) | 2410 | 33.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 5575 | 76.3%
Space Separator | 906 | 12.4%
Uppercase Letter | 749 | 10.3%
Other Punctuation | 25 | 0.3%
Decimal Number | 22 | 0.3%
Close Punctuation | 11 | 0.2%
Open Punctuation | 11 | 0.2%
Dash Punctuation | 7 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
i | 674 | 12.1%
e | 662 | 11.9%
n | 455 | 8.2%
o | 446 | 8.0%
m | 364 | 6.5%
s | 360 | 6.5%
l | 355 | 6.4%
t | 339 | 6.1%
a | 335 | 6.0%
g | 321 | 5.8%
Other values (16) | 1264 | 22.7%

Uppercase Letter
Value | Count | Frequency (%)
F | 141 | 18.8%
S | 118 | 15.8%
G | 54 | 7.2%
T | 48 | 6.4%
B | 46 | 6.1%
M | 39 | 5.2%
D | 37 | 4.9%
C | 30 | 4.0%
H | 28 | 3.7%
O | 28 | 3.7%
Other values (13) | 180 | 24.0%

Decimal Number
Value | Count | Frequency (%)
1 | 8 | 36.4%
2 | 4 | 18.2%
0 | 3 | 13.6%
3 | 3 | 13.6%
5 | 2 | 9.1%
9 | 1 | 4.5%
4 | 1 | 4.5%

Other Punctuation
Value | Count | Frequency (%)
' | 23 | 92.0%
. | 1 | 4.0%
& | 1 | 4.0%

Close Punctuation
Value | Count | Frequency (%)
) | 7 | 63.6%
] | 4 | 36.4%

Open Punctuation
Value | Count | Frequency (%)
( | 7 | 63.6%
[ | 4 | 36.4%

Space Separator
Value | Count | Frequency (%)
(space) | 906 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 7 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 6324 | 86.6%
Common | 982 | 13.4%

Most frequent character per script

Latin
Value | Count | Frequency (%)
i | 674 | 10.7%
e | 662 | 10.5%
n | 455 | 7.2%
o | 446 | 7.1%
m | 364 | 5.8%
s | 360 | 5.7%
l | 355 | 5.6%
t | 339 | 5.4%
a | 335 | 5.3%
g | 321 | 5.1%
Other values (39) | 2013 | 31.8%

Common
Value | Count | Frequency (%)
(space) | 906 | 92.3%
' | 23 | 2.3%
1 | 8 | 0.8%
) | 7 | 0.7%
- | 7 | 0.7%
( | 7 | 0.7%
2 | 4 | 0.4%
] | 4 | 0.4%
[ | 4 | 0.4%
0 | 3 | 0.3%
Other values (6) | 9 | 0.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 7306 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 906 | 12.4%
i | 674 | 9.2%
e | 662 | 9.1%
n | 455 | 6.2%
o | 446 | 6.1%
m | 364 | 5.0%
s | 360 | 4.9%
l | 355 | 4.9%
t | 339 | 4.6%
a | 335 | 4.6%
Other values (55) | 2410 | 33.0%

행정 시
Categorical

Distinct: 12
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Value | Count
<NA> | 147
Gyeonggi-do | 20
Gyeongsangnam-do | 19
Gyeongsangbuk-do | 18
Jeollanam-do | 17
Other values (7) | 65

Length

Max length: 17
Median length: 4
Mean length: 8.1013986
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Jeollanam-do
2nd row: <NA>
3rd row: Jeollanam-do
4th row: <NA>
5th row: Gyeonggi-do

Common Values

Value | Count | Frequency (%)
<NA> | 147 | 51.4%
Gyeonggi-do | 20 | 7.0%
Gyeongsangnam-do | 19 | 6.6%
Gyeongsangbuk-do | 18 | 6.3%
Jeollanam-do | 17 | 5.9%
Jeju-do | 16 | 5.6%
Gangwon-do | 12 | 4.2%
Jeollabuk-do | 10 | 3.5%
Chungcheongbuk-do | 10 | 3.5%
Chungcheongnam-do | 8 | 2.8%
Other values (2) | 9 | 3.1%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
na | 147 | 51.4%
gyeonggi-do | 20 | 7.0%
gyeongsangnam-do | 19 | 6.6%
gyeongsangbuk-do | 18 | 6.3%
jeollanam-do | 17 | 5.9%
jeju-do | 16 | 5.6%
gangwon-do | 12 | 4.2%
jeollabuk-do | 10 | 3.5%
chungcheongbuk-do | 10 | 3.5%
chungcheongnam-do | 8 | 2.8%
Other values (2) | 9 | 3.1%

행정 구
Text

MISSING 

Distinct: 63
Distinct (%): 45.3%
Missing: 147
Missing (%): 51.4%
Memory size: 2.4 KiB

Length

Max length: 20
Median length: 18
Mean length: 10.841727
Min length: 7

Characters and Unicode

Total characters: 1507
Distinct characters: 38
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 32
Unique (%): 23.0%

Sample

1st row: Damyang-gun
2nd row: Wando-gun
3rd row: Yangju-si
4th row: Namwon-si
5th row: Hadong-gun

Value | Count | Frequency (%)
namyangju-si | 9 | 6.2%
jeju-si | 9 | 6.2%
seogwipo-si | 7 | 4.8%
sancheon-gun | 7 | 4.8%
wando-gun | 6 | 4.1%
jecheon-si | 6 | 4.1%
pyeongchang-gun | 5 | 3.4%
mungyeong-si | 4 | 2.7%
buyeo-gun | 4 | 2.7%
wonmi-gu | 3 | 2.1%
Other values (57) | 86 | 58.9%

Most occurring characters

Value | Count | Frequency (%)
n | 225 | 14.9%
g | 170 | 11.3%
- | 146 | 9.7%
u | 130 | 8.6%
o | 106 | 7.0%
e | 98 | 6.5%
a | 98 | 6.5%
i | 88 | 5.8%
s | 81 | 5.4%
h | 46 | 3.1%
Other values (28) | 319 | 21.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1208 | 80.2%
Dash Punctuation | 146 | 9.7%
Uppercase Letter | 146 | 9.7%
Space Separator | 7 | 0.5%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
n | 225 | 18.6%
g | 170 | 14.1%
u | 130 | 10.8%
o | 106 | 8.8%
e | 98 | 8.1%
a | 98 | 8.1%
i | 88 | 7.3%
s | 81 | 6.7%
h | 46 | 3.8%
c | 33 | 2.7%
Other values (9) | 133 | 11.0%

Uppercase Letter
Value | Count | Frequency (%)
S | 22 | 15.1%
J | 22 | 15.1%
N | 14 | 9.6%
B | 14 | 9.6%
G | 11 | 7.5%
Y | 11 | 7.5%
W | 9 | 6.2%
H | 8 | 5.5%
P | 7 | 4.8%
C | 6 | 4.1%
Other values (7) | 22 | 15.1%

Dash Punctuation
Value | Count | Frequency (%)
- | 146 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 7 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1354 | 89.8%
Common | 153 | 10.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
n | 225 | 16.6%
g | 170 | 12.6%
u | 130 | 9.6%
o | 106 | 7.8%
e | 98 | 7.2%
a | 98 | 7.2%
i | 88 | 6.5%
s | 81 | 6.0%
h | 46 | 3.4%
c | 33 | 2.4%
Other values (26) | 279 | 20.6%

Common
Value | Count | Frequency (%)
- | 146 | 95.4%
(space) | 7 | 4.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1507 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
n | 225 | 14.9%
g | 170 | 11.3%
- | 146 | 9.7%
u | 130 | 8.6%
o | 106 | 7.0%
e | 98 | 6.5%
a | 98 | 6.5%
i | 88 | 5.8%
s | 81 | 5.4%
h | 46 | 3.1%
Other values (28) | 319 | 21.2%

행정 동
Text

MISSING 

Distinct: 91
Distinct (%): 65.5%
Missing: 147
Missing (%): 51.4%
Memory size: 2.4 KiB

Length

Max length: 19
Median length: 17
Mean length: 11.935252
Min length: 7

Characters and Unicode

Total characters: 1659
Distinct characters: 42
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 64
Unique (%): 46.0%

Sample

1st row: Geumseong-myeon
2nd row: Wando-eup
3rd row: Yangju2-dong
4th row: Noam-dong
5th row: Jingyo-myeon

Value | Count | Frequency (%)
joan-myeon | 8 | 5.8%
chahwang-myeon | 6 | 4.3%
geumseong-myeon | 4 | 2.9%
daegwallyeong-myeon | 4 | 2.9%
gujwa-eup | 4 | 2.9%
chunghwa-myeon | 3 | 2.2%
mungyeong-eup | 3 | 2.2%
sang3-dong | 3 | 2.2%
wando-eup | 3 | 2.2%
bukdo-myeon | 3 | 2.2%
Other values (81) | 98 | 70.5%

Most occurring characters

Value | Count | Frequency (%)
n | 243 | 14.6%
o | 194 | 11.7%
e | 187 | 11.3%
- | 139 | 8.4%
g | 112 | 6.8%
y | 112 | 6.8%
m | 108 | 6.5%
a | 101 | 6.1%
u | 95 | 5.7%
p | 38 | 2.3%
Other values (32) | 330 | 19.9%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1374 | 82.8%
Dash Punctuation | 139 | 8.4%
Uppercase Letter | 139 | 8.4%
Decimal Number | 7 | 0.4%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
n | 243 | 17.7%
o | 194 | 14.1%
e | 187 | 13.6%
g | 112 | 8.2%
y | 112 | 8.2%
m | 108 | 7.9%
a | 101 | 7.4%
u | 95 | 6.9%
p | 38 | 2.8%
h | 32 | 2.3%
Other values (11) | 152 | 11.1%

Uppercase Letter
Value | Count | Frequency (%)
G | 18 | 12.9%
J | 17 | 12.2%
S | 16 | 11.5%
C | 15 | 10.8%
B | 15 | 10.8%
Y | 11 | 7.9%
D | 8 | 5.8%
H | 7 | 5.0%
N | 7 | 5.0%
W | 6 | 4.3%
Other values (7) | 19 | 13.7%

Decimal Number
Value | Count | Frequency (%)
1 | 3 | 42.9%
3 | 3 | 42.9%
2 | 1 | 14.3%

Dash Punctuation
Value | Count | Frequency (%)
- | 139 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1513 | 91.2%
Common | 146 | 8.8%

Most frequent character per script

Latin
Value | Count | Frequency (%)
n | 243 | 16.1%
o | 194 | 12.8%
e | 187 | 12.4%
g | 112 | 7.4%
y | 112 | 7.4%
m | 108 | 7.1%
a | 101 | 6.7%
u | 95 | 6.3%
p | 38 | 2.5%
h | 32 | 2.1%
Other values (28) | 291 | 19.2%

Common
Value | Count | Frequency (%)
- | 139 | 95.2%
1 | 3 | 2.1%
3 | 3 | 2.1%
2 | 1 | 0.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1659 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
n | 243 | 14.6%
o | 194 | 11.7%
e | 187 | 11.3%
- | 139 | 8.4%
g | 112 | 6.8%
y | 112 | 6.8%
m | 108 | 6.5%
a | 101 | 6.1%
u | 95 | 5.7%
p | 38 | 2.3%
Other values (32) | 330 | 19.9%

Correlations

Variable | 행정 시 | 행정 구 | 행정 동
행정 시 | 1.000 | 1.000 | 0.999
행정 구 | 1.000 | 1.000 | 1.000
행정 동 | 0.999 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
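
One common way to compute a nullity correlation is to take the Pearson correlation between per-column "is missing" indicators. The sketch below applies that idea to the 행정 구 and 행정 동 values from the first ten rows of this report's sample table. It is an assumption about the method, not something the report states. The `pearson` helper is written here for illustration and is not part of any library.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A column that is never (or always) missing has no defined correlation.
        return float("nan")
    return cov / sqrt(var_x * var_y)


# First ten sample rows from this report; None stands for a missing value.
gu = ["Damyang-gun", None, "Wando-gun", None, "Yangju-si",
      None, "Namwon-si", "Hadong-gun", "Suncheon-si", "Saha-gu"]
dong = ["Geumseong-myeon", None, "Wando-eup", None, "Yangju2-dong",
        None, "Noam-dong", "Jingyo-myeon", "Seungju-eup", "Hadan1-dong"]

# 1 where the value is missing, 0 where it is present.
gu_missing = [1 if value is None else 0 for value in gu]
dong_missing = [1 if value is None else 0 for value in dong]

r = pearson(gu_missing, dong_missing)
print(f"Nullity correlation (행정 구 vs 행정 동): {r:.3f}")
```

In these ten rows the two columns are missing in exactly the same places, so the indicators are identical and the correlation is at its maximum. That is consistent with both columns reporting 147 missing values.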

Sample

(index) | (key) | 명칭 | 행정 시 | 행정 구 | 행정 동
0 | BE_IW04-0232 | Sword in the Moon Filming site | Jeollanam-do | Damyang-gun | Geumseong-myeon
1 | BE_IW04-0233 | Sword in the Moon Filming site | <NA> | <NA> | <NA>
2 | BE_IW04-0234 | Cheonghaejin Port | Jeollanam-do | Wando-gun | Wando-eup
3 | BE_IW04-0235 | Chowon Photo Studio | <NA> | <NA> | <NA>
4 | BE_IW04-0236 | Filming site | Gyeonggi-do | Yangju-si | Yangju2-dong
5 | BE_IW04-0237 | Chunhyangdyeon Filming site | <NA> | <NA> | <NA>
6 | BE_IW04-0238 | Chunhyangdyeon Filming site | Jeollabuk-do | Namwon-si | Noam-dong
7 | BE_IW04-0239 | Chwihwaseon Filming site | Gyeongsangnam-do | Hadong-gun | Jingyo-myeon
8 | BE_IW04-0240 | Chwihwaseon Filming site | Jeollanam-do | Suncheon-si | Seungju-eup
9 | BE_IW04-0241 | Friend Filming site | Busan | Saha-gu | Hadan1-dong

(index) | (key) | 명칭 | 행정 시 | 행정 구 | 행정 동
276 | BE_IW04-0222 | Jukseong Filming site | <NA> | <NA> | <NA>
277 | BE_IW04-0223 | Jurassic studio | <NA> | <NA> | <NA>
278 | BE_IW04-0224 | Stairway to Heaven Filming site | Incheon | Jung-gu | Yongyu-dong
279 | BE_IW04-0225 | Heaven's Soldiers | Gyeongsangnam-do | Sancheon-gun | Chahwang-myeon
280 | BE_IW04-0226 | Beyond the Years Filming site | Jeollanam-do | Jangheung-gun | Hoejin-myeon
281 | BE_IW04-0227 | Beyond the Years Filming site | <NA> | <NA> | <NA>
282 | BE_IW04-0228 | Cheongseokkol Filming site | <NA> | <NA> | <NA>
283 | BE_IW04-0229 | Springtime Filming site | Gyeongsangnam-do | Hadong-gun | Agyang-myeon
284 | BE_IW04-0230 | Springtime Filming site | <NA> | <NA> | <NA>
285 | BE_IW04-0231 | Springtime Filming site | <NA> | <NA> | <NA>