Overview

Dataset statistics

Number of variables: 5
Number of observations: 1104
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 43.3 KiB
Average record size in memory: 40.1 B
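
As a rough sanity check, the average record size should be close to the total in-memory size divided by the number of observations. The sketch below redoes that arithmetic from the rounded figures in this report. The variable names are illustrative, and the result can differ from the reported value in the last digit because the inputs are already rounded.

```python
# Figures copied from the dataset statistics in this report (already rounded).
total_size_kib = 43.3
n_observations = 1104

# 1 KiB = 1024 bytes
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B per record")
```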

Variable types

Text: 3
Categorical: 2

Dataset

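Description: 키 (key), 명칭 (name), 행정시 (administrative city), 행정구 (administrative district), 행정동 (administrative neighborhood)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-13042/S/1/datasetView.do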

Alerts

행정시 has constant value "Seoul" (Constant)
키 has unique values (Unique)
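
The two alert types can be illustrated with a minimal sketch: a column is flagged as constant when it holds a single distinct value, and as unique when every value differs. The function name `column_alerts` and the row layout are assumptions made for illustration, not the profiler's own code. The rows are the first four from the sample in this report.

```python
def column_alerts(rows, columns):
    """Return {column: alert} for columns that are constant or fully unique."""
    alerts = {}
    for col in columns:
        values = [row[col] for row in rows]
        distinct = set(values)
        if len(distinct) == 1:
            alerts[col] = "Constant"
        elif len(distinct) == len(values):
            alerts[col] = "Unique"
    return alerts

# First four rows of the dataset, restricted to three columns.
rows = [
    {"키": "BE_IW16-0908", "명칭": "Choegojip Restaurant", "행정시": "Seoul"},
    {"키": "BE_IW16-0909", "명칭": "Choegojip Hand-rolled Noodle", "행정시": "Seoul"},
    {"키": "BE_IW16-0910", "명칭": "Chupungnyeong Gamjatang", "행정시": "Seoul"},
    {"키": "BE_IW16-0911", "명칭": "Chupungnyeong Gamjatang", "행정시": "Seoul"},
]
alerts = column_alerts(rows, ["키", "명칭", "행정시"])
print(alerts)
```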

Reproduction

Analysis started: 2023-12-11 04:05:36.007218
Analysis finished: 2023-12-11 04:05:36.340859
Duration: 0.33 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables


키
Text

UNIQUE 

Distinct: 1104
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 13248
Distinct characters: 16
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
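
As an illustration of these character properties, the following minimal sketch counts the Unicode general categories in one key from this dataset using Python's standard `unicodedata` module. The helper name `category_counts` is an assumption for illustration. The two-letter codes returned by `unicodedata.category` correspond to the long names used in this report (`Lu` is Uppercase Letter, `Nd` is Decimal Number, `Pc` is Connector Punctuation, `Pd` is Dash Punctuation).

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count Unicode general categories (e.g. 'Lu', 'Nd') in a string."""
    return Counter(unicodedata.category(ch) for ch in text)

# One 12-character key taken from the sample rows.
counts = category_counts("BE_IW16-0908")
for category, count in counts.most_common():
    print(category, count)
```

For this key, half of the characters are decimal digits and a third are uppercase letters, which matches the 50.0% and 33.3% shares reported for the whole column.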

Unique

Unique: 1104
Unique (%): 100.0%

Sample

1st row: BE_IW16-0908
2nd row: BE_IW16-0909
3rd row: BE_IW16-0910
4th row: BE_IW16-0911
5th row: BE_IW16-0912

Value                Count  Frequency (%)
be_iw16-0908         1      0.1%
be_iw16-0462         1      0.1%
be_iw16-0457         1      0.1%
be_iw16-0458         1      0.1%
be_iw16-0459         1      0.1%
be_iw16-0460         1      0.1%
be_iw16-0461         1      0.1%
be_iw16-0454         1      0.1%
be_iw16-0464         1      0.1%
be_iw16-0453         1      0.1%
Other values (1094)  1094   99.1%

Most occurring characters

Value             Count  Frequency (%)
1                 1535   11.6%
6                 1424   10.7%
0                 1422   10.7%
B                 1104   8.3%
E                 1104   8.3%
_                 1104   8.3%
I                 1104   8.3%
W                 1104   8.3%
-                 1104   8.3%
2                 321    2.4%
Other values (6)  1922   14.5%

Most occurring categories

Value                  Count  Frequency (%)
Decimal Number         6624   50.0%
Uppercase Letter       4416   33.3%
Connector Punctuation  1104   8.3%
Dash Punctuation       1104   8.3%
8.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      1535   23.2%
6      1424   21.5%
0      1422   21.5%
2      321    4.8%
4      321    4.8%
3      321    4.8%
9      320    4.8%
8      320    4.8%
7      320    4.8%
5      320    4.8%

Uppercase Letter
Value  Count  Frequency (%)
B      1104   25.0%
E      1104   25.0%
I      1104   25.0%
W      1104   25.0%

Connector Punctuation
Value  Count  Frequency (%)
_      1104   100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      1104   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  8832   66.7%
Latin   4416   33.3%

Most frequent character per script

Common
Value             Count  Frequency (%)
1                 1535   17.4%
6                 1424   16.1%
0                 1422   16.1%
_                 1104   12.5%
-                 1104   12.5%
2                 321    3.6%
4                 321    3.6%
3                 321    3.6%
9                 320    3.6%
8                 320    3.6%
Other values (2)  640    7.2%

Latin
Value  Count  Frequency (%)
B      1104   25.0%
E      1104   25.0%
I      1104   25.0%
W      1104   25.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  13248  100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
1                 1535   11.6%
6                 1424   10.7%
0                 1422   10.7%
B                 1104   8.3%
E                 1104   8.3%
_                 1104   8.3%
I                 1104   8.3%
W                 1104   8.3%
-                 1104   8.3%
2                 321    2.4%
Other values (6)  1922   14.5%

명칭
Text

Distinct: 752
Distinct (%): 68.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB

Length

Max length: 51
Median length: 38
Mean length: 14.237319
Min length: 3

Characters and Unicode

Total characters: 15718
Distinct characters: 61
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 572
Unique (%): 51.8%
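
The "Distinct" and "Unique" figures measure different things: as the sketch below assumes, distinct counts how many different values occur, while unique counts only the values that occur exactly once. The helper `distinct_and_unique` is a hypothetical name used for illustration, applied to the five sample rows of this column.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    n_distinct = len(counts)
    n_unique = sum(1 for c in counts.values() if c == 1)
    return n_distinct, n_unique

# The five sample rows of the 명칭 column.
names = [
    "Choegojip Restaurant",
    "Choegojip Hand-rolled Noodle",
    "Chupungnyeong Gamjatang",
    "Chupungnyeong Gamjatang",
    "Chunjane Daegutang",
]
n_distinct, n_unique = distinct_and_unique(names)
print(n_distinct, n_unique)
```

Here "Chupungnyeong Gamjatang" appears twice, so it counts toward distinct values but not toward unique ones.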

Sample

1st row: Choegojip Restaurant
2nd row: Choegojip Hand-rolled Noodle
3rd row: Chupungnyeong Gamjatang
4th row: Chupungnyeong Gamjatang
5th row: Chunjane Daegutang

Value               Count  Frequency (%)
restaurant          173    8.6%
soup                31     1.5%
garden              24     1.2%
galbi               23     1.1%
sikdang             19     0.9%
jeongju             17     0.8%
noodles             17     0.8%
mapo                16     0.8%
rice                15     0.7%
namwon              14     0.7%
Other values (893)  1652   82.6%
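
The word table above lists lowercased tokens. A minimal sketch of how such counts could be produced is shown below, assuming names are lowercased and split on whitespace; the profiler's actual tokenisation may differ, and `word_frequencies` is a hypothetical helper.

```python
from collections import Counter

def word_frequencies(names):
    """Return (word, count, percent of all words), most frequent first."""
    words = [word for name in names for word in name.lower().split()]
    total = len(words)
    return [(w, c, 100 * c / total) for w, c in Counter(words).most_common()]

# The five sample rows of the 명칭 column.
names = [
    "Choegojip Restaurant",
    "Choegojip Hand-rolled Noodle",
    "Chupungnyeong Gamjatang",
    "Chupungnyeong Gamjatang",
    "Chunjane Daegutang",
]
table = word_frequencies(names)
for word, count, percent in table:
    print(f"{word:15s} {count:3d} {percent:5.1f}%")
```

Note that the percentages are relative to the total number of words, not the number of rows, which is why "restaurant" with 173 occurrences accounts for 8.6% rather than 15.7%.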

Most occurring characters

Value              Count  Frequency (%)
a                  1709   10.9%
n                  1501   9.5%
o                  1213   7.7%
e                  1170   7.4%
g                  928    5.9%
(space)            916    5.8%
u                  813    5.2%
i                  669    4.3%
t                  578    3.7%
s                  503    3.2%
Other values (51)  5718   36.4%

Most occurring categories

Value              Count  Frequency (%)
Lowercase Letter   12594  80.1%
Uppercase Letter   1986   12.6%
Space Separator    916    5.8%
Dash Punctuation   64     0.4%
Open Punctuation   63     0.4%
Close Punctuation  62     0.4%
Other Punctuation  26     0.2%
Decimal Number     7      < 0.1%

Most frequent character per category

Lowercase Letter
Value              Count  Frequency (%)
a                  1709   13.6%
n                  1501   11.9%
o                  1213   9.6%
e                  1170   9.3%
g                  928    7.4%
u                  813    6.5%
i                  669    5.3%
t                  578    4.6%
s                  503    4.0%
r                  492    3.9%
Other values (15)  3018   24.0%

Uppercase Letter
Value              Count  Frequency (%)
S                  296    14.9%
R                  200    10.1%
G                  197    9.9%
H                  172    8.7%
B                  143    7.2%
C                  133    6.7%
M                  131    6.6%
J                  120    6.0%
D                  97     4.9%
Y                  79     4.0%
Other values (13)  418    21.0%

Other Punctuation
Value                             Count  Frequency (%)
.                                 15     57.7%
&                                 8      30.8%
?                                 1      3.8%
(character lost in extraction)    1      3.8%
/                                 1      3.8%

Decimal Number
Value  Count  Frequency (%)
6      2      28.6%
1      2      28.6%
9      2      28.6%
2      1      14.3%

Space Separator
Value    Count  Frequency (%)
(space)  916    100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      64     100.0%

Open Punctuation
Value  Count  Frequency (%)
(      63     100.0%

Close Punctuation
Value  Count  Frequency (%)
)      62     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   14580  92.8%
Common  1138   7.2%

Most frequent character per script

Latin
Value              Count  Frequency (%)
a                  1709   11.7%
n                  1501   10.3%
o                  1213   8.3%
e                  1170   8.0%
g                  928    6.4%
u                  813    5.6%
i                  669    4.6%
t                  578    4.0%
s                  503    3.4%
r                  492    3.4%
Other values (38)  5004   34.3%

Common
Value             Count  Frequency (%)
(space)           916    80.5%
-                 64     5.6%
(                 63     5.5%
)                 62     5.4%
.                 15     1.3%
&                 8      0.7%
6                 2      0.2%
1                 2      0.2%
9                 2      0.2%
?                 1      0.1%
Other values (3)  3      0.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  15717  > 99.9%
None   1      < 0.1%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
a                  1709   10.9%
n                  1501   9.6%
o                  1213   7.7%
e                  1170   7.4%
g                  928    5.9%
(space)            916    5.8%
u                  813    5.2%
i                  669    4.3%
t                  578    3.7%
s                  503    3.2%
Other values (50)  5717   36.4%

None
Value                           Count  Frequency (%)
(character lost in extraction)  1      100.0%

행정시
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB

Value  Count
Seoul  1104

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Seoul
2nd row: Seoul
3rd row: Seoul
4th row: Seoul
5th row: Seoul

Common Values

Value  Count  Frequency (%)
Seoul  1104   100.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
seoul  1104   100.0%

행정구
Categorical

Distinct: 25
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB

Value              Count
Gangnam-gu         153
Seocho-gu          93
Jongno-gu          82
Gangbuk-gu         67
Jung-gu            58
Other values (20)  651

Length

Max length: 15
Median length: 13
Mean length: 10.015399
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Seocho-gu
2nd row: Eunpyeong-gu
3rd row: Gangdong-gu
4th row: Jongno-gu
5th row: Mapo-gu

Common Values

Value              Count  Frequency (%)
Gangnam-gu         153    13.9%
Seocho-gu          93     8.4%
Jongno-gu          82     7.4%
Gangbuk-gu         67     6.1%
Jung-gu            58     5.3%
Songpa-gu          57     5.2%
Mapo-gu            49     4.4%
Yeongdeungpo-gu    45     4.1%
Gwanak-gu          43     3.9%
Seongbuk-gu        42     3.8%
Other values (15)  415    37.6%
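
A common-values table of this shape lists the most frequent categories and folds the rest into a single "Other values (n)" row. The sketch below shows one way to build it, assuming a hypothetical helper `common_values`. The input is the 행정구 value of the first ten rows of the dataset, so the counts are much smaller than in the full table.

```python
from collections import Counter

def common_values(values, top_n=10):
    """Return (value, count, percent) rows, folding the tail into one row."""
    counts = Counter(values)
    total = len(values)
    top = counts.most_common(top_n)
    rows = [(value, count, round(100 * count / total, 1)) for value, count in top]
    n_other = len(counts) - len(top)
    if n_other > 0:
        rest = total - sum(count for _, count in top)
        rows.append((f"Other values ({n_other})", rest, round(100 * rest / total, 1)))
    return rows

# 행정구 values of the first ten rows.
districts = [
    "Seocho-gu", "Eunpyeong-gu", "Gangdong-gu", "Jongno-gu", "Mapo-gu",
    "Geumcheon-gu", "Dongdaemun-gu", "Jongno-gu", "Yangcheon-gu", "Yongsan-gu",
]
for row in common_values(districts, top_n=1):
    print(row)
```

The same arithmetic applies to the full table: the ten listed districts sum to 689 rows, leaving 1104 - 689 = 415 rows for the remaining 15 districts.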

Length

Figure: Histogram of lengths of the category

Value              Count  Frequency (%)
gangnam-gu         153    13.9%
seocho-gu          93     8.4%
jongno-gu          82     7.4%
gangbuk-gu         67     6.1%
jung-gu            58     5.3%
songpa-gu          57     5.2%
mapo-gu            49     4.4%
yeongdeungpo-gu    45     4.1%
gwanak-gu          43     3.9%
seongbuk-gu        42     3.8%
Other values (15)  415    37.6%

행정동
Text

Distinct: 321
Distinct (%): 29.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB

Length

Max length: 20
Median length: 17
Mean length: 12.5
Min length: 7

Characters and Unicode

Total characters: 13800
Distinct characters: 48
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 98
Unique (%): 8.9%

Sample

1st row: Seocho3-dong
2nd row: Yeokchon-dong
3rd row: Seongnae2-dong
4th row: Jongno1.2.3.4ga-dong
5th row: Seogyo-dong

Value                 Count  Frequency (%)
ui-dong               31     2.8%
jongno1.2.3.4ga-dong  25     2.3%
yeoksam1-dong         25     2.3%
seocho3-dong          20     1.8%
cheongdam-dong        20     1.8%
myeong-dong           16     1.4%
nonhyeon2-dong        16     1.4%
apgujeong-dong        14     1.3%
naegok-dong           13     1.2%
seogyo-dong           12     1.1%
Other values (311)    912    82.6%

Most occurring characters

Value              Count  Frequency (%)
n                  2201   15.9%
o                  2039   14.8%
g                  1909   13.8%
d                  1173   8.5%
-                  1104   8.0%
a                  744    5.4%
e                  642    4.7%
h                  318    2.3%
S                  280    2.0%
i                  270    2.0%
Other values (38)  3120   22.6%

Most occurring categories

Value              Count  Frequency (%)
Lowercase Letter   10828  78.5%
Dash Punctuation   1104   8.0%
Uppercase Letter   1104   8.0%
Decimal Number     679    4.9%
Other Punctuation  85     0.6%

Most frequent character per category

Lowercase Letter
Value              Count  Frequency (%)
n                  2201   20.3%
o                  2039   18.8%
g                  1909   17.6%
d                  1173   10.8%
a                  744    6.9%
e                  642    5.9%
h                  318    2.9%
i                  270    2.5%
u                  267    2.5%
k                  189    1.7%
Other values (11)  1076   9.9%

Uppercase Letter
Value             Count  Frequency (%)
S                 280    25.4%
Y                 109    9.9%
J                 107    9.7%
G                 97     8.8%
D                 80     7.2%
H                 72     6.5%
B                 66     6.0%
C                 57     5.2%
N                 56     5.1%
M                 54     4.9%
Other values (7)  126    11.4%

Decimal Number
Value  Count  Frequency (%)
1      254    37.4%
2      213    31.4%
3      101    14.9%
4      74     10.9%
5      18     2.7%
6      13     1.9%
7      4      0.6%
8      2      0.3%

Dash Punctuation
Value  Count  Frequency (%)
-      1104   100.0%

Other Punctuation
Value  Count  Frequency (%)
.      85     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   11932  86.5%
Common  1868   13.5%

Most frequent character per script

Latin
Value              Count  Frequency (%)
n                  2201   18.4%
o                  2039   17.1%
g                  1909   16.0%
d                  1173   9.8%
a                  744    6.2%
e                  642    5.4%
h                  318    2.7%
S                  280    2.3%
i                  270    2.3%
u                  267    2.2%
Other values (28)  2089   17.5%

Common
Value  Count  Frequency (%)
-      1104   59.1%
1      254    13.6%
2      213    11.4%
3      101    5.4%
.      85     4.6%
4      74     4.0%
5      18     1.0%
6      13     0.7%
7      4      0.2%
8      2      0.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  13800  100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
n                  2201   15.9%
o                  2039   14.8%
g                  1909   13.8%
d                  1173   8.5%
-                  1104   8.0%
a                  744    5.4%
e                  642    4.7%
h                  318    2.3%
S                  280    2.0%
i                  270    2.0%
Other values (38)  3120   22.6%

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.
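
The per-column nullity that these plots summarise can be sketched as a simple count of missing cells per column. The helper `missing_per_column` is hypothetical, and treating `None` as the only missing marker is an assumption. The third row below is made up to show a non-zero count, since this dataset itself has no missing cells.

```python
def missing_per_column(rows, columns):
    """Count missing cells per column, treating None as missing (an assumption)."""
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}

# Two rows from this dataset plus one hypothetical row with a gap.
rows = [
    {"키": "BE_IW16-0908", "명칭": "Choegojip Restaurant", "행정구": "Seocho-gu"},
    {"키": "BE_IW16-0909", "명칭": "Choegojip Hand-rolled Noodle", "행정구": "Eunpyeong-gu"},
    {"키": "HYPOTHETICAL-1", "명칭": None, "행정구": "Mapo-gu"},  # made-up row
]
missing = missing_per_column(rows, ["키", "명칭", "행정구"])
print(missing)
```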

Sample

First rows

Index  키            명칭                                  행정시  행정구         행정동
0      BE_IW16-0908  Choegojip Restaurant                  Seoul   Seocho-gu      Seocho3-dong
1      BE_IW16-0909  Choegojip Hand-rolled Noodle          Seoul   Eunpyeong-gu   Yeokchon-dong
2      BE_IW16-0910  Chupungnyeong Gamjatang               Seoul   Gangdong-gu    Seongnae2-dong
3      BE_IW16-0911  Chupungnyeong Gamjatang               Seoul   Jongno-gu      Jongno1.2.3.4ga-dong
4      BE_IW16-0912  Chunjane Daegutang                    Seoul   Mapo-gu        Seogyo-dong
5      BE_IW16-0913  Chuchunok                             Seoul   Geumcheon-gu   Gasan-dong
6      BE_IW16-0914  Chuncheon Wang Spicy Grilled Chicken  Seoul   Dongdaemun-gu  Hoegi-dong
7      BE_IW16-0915  Chuncheonjip                          Seoul   Jongno-gu      Pyeongchang-dong
8      BE_IW16-0916  Chuncheonjip                          Seoul   Yangcheon-gu   Mok1-dong
9      BE_IW16-0917  Chunhachudong                         Seoul   Yongsan-gu     Hyochang-dong

Last rows

Index  키            명칭              행정시  행정구        행정동
1094   BE_IW16-0279  Mapo Galbi        Seoul   Guro-gu       Guro4-dong
1095   BE_IW16-0280  Mapo Galbi        Seoul   Gangnam-gu    Nonhyeon1-dong
1096   BE_IW16-0281  Mapo Galbi        Seoul   Yangcheon-gu  Sinwol7-dong
1097   BE_IW16-0282  Mapo Galbi        Seoul   Gangbuk-gu    Samgaksan-dong
1098   BE_IW16-0283  Mapo Galbi        Seoul   Gwanak-gu     Euncheon-dong
1099   BE_IW16-0284  Mapo Naru         Seoul   Mapo-gu       Sinsu-dong
1100   BE_IW16-0285  Mapo Sogeum-gui   Seoul   Gangnam-gu    Daechi2-dong
1101   BE_IW16-0286  Mapo Sogeum-gui   Seoul   Mapo-gu       Hapjeong-dong
1102   BE_IW16-0287  Mapo Sogeum-gui   Seoul   Jungnang-gu   Muk2-dong
1103   BE_IW16-0288  Mapo Sutbulgalbi  Seoul   Songpa-gu     Pungnap2-dong