Overview

Dataset statistics

Number of variables: 5
Number of observations: 240
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 9.5 KiB
Average record size in memory: 40.5 B
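
The figures in this table can be cross-checked with simple arithmetic. The sketch below assumes that the average record size is the total memory divided by the number of observations; the report does not state this, and the total is rounded to 9.5 KiB, so the result is approximate.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 5
n_observations = 240
missing_cells = 0
total_size_kib = 9.5  # rounded value shown in the report

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
# Assumption: average record size = total bytes / number of rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under that assumption the result rounds to the 40.5 B shown in the table.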

Variable types

Text: 3
Categorical: 2

Dataset

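Description: Main key, name, administrative city, administrative district, administrative neighborhood (columns 메인 키, 명칭, 행정 시, 행정 구, 행정 동)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-13078/S/1/datasetView.do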

Alerts

행정 시 has constant value "" (Constant)
메인 키 has unique values (Unique)
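
The two alerts correspond to simple conditions on a column: one distinct value (Constant) or as many distinct values as rows (Unique). The following is a minimal sketch of those conditions, not ydata-profiling's implementation; the function name `column_alerts` and the sample values are hypothetical.

```python
def column_alerts(values):
    """Return the alert labels that apply to a list of column values."""
    distinct = set(values)
    alerts = []
    if len(distinct) == 1:
        alerts.append("Constant")  # every row holds the same value
    if values and len(distinct) == len(values):
        alerts.append("Unique")    # no value is repeated
    return alerts

# Hypothetical three-row samples, for illustration only.
keys = ["BE_LiST36-0040", "BE_LiST36-0041", "BE_LiST36-0042"]
city = ["Seoul", "Seoul", "Seoul"]

print(column_alerts(keys))
print(column_alerts(city))
```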

Reproduction

Analysis started: 2023-12-11 04:21:52.087599
Analysis finished: 2023-12-11 04:21:52.715281
Duration: 0.63 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

메인 키
Text

UNIQUE 

Distinct: 240
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

Length

Max length: 14
Median length: 14
Mean length: 14
Min length: 14

Characters and Unicode

Total characters: 3360
Distinct characters: 18
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
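
As an illustration of these character properties, the sketch below tallies the Unicode general category of each character in the first sample key using Python's standard `unicodedata` module. It covers categories only; the standard library does not expose script or block properties, so those columns of the report are not reproduced here.

```python
import unicodedata
from collections import Counter

key = "BE_LiST36-0040"  # first sample row of 메인 키

# unicodedata.category returns two-letter codes:
# Lu = Uppercase Letter, Ll = Lowercase Letter, Nd = Decimal Number,
# Pc = Connector Punctuation, Pd = Dash Punctuation.
category_counts = Counter(unicodedata.category(ch) for ch in key)

for code, count in category_counts.most_common():
    print(code, count)
```

Each key in the sample has the same shape, so multiplying the per-key counts by 240 rows is consistent with the category totals reported for this variable (for example, 6 digits per key and 1440 Decimal Number characters).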

Unique

Unique: 240
Unique (%): 100.0%

Sample

1st row: BE_LiST36-0040
2nd row: BE_LiST36-0041
3rd row: BE_LiST36-0042
4th row: BE_LiST36-0043
5th row: BE_LiST36-0044
Value | Count | Frequency (%)
be_list36-0040 | 1 | 0.4%
be_list36-0041 | 1 | 0.4%
be_list36-0204 | 1 | 0.4%
be_list36-0192 | 1 | 0.4%
be_list36-0193 | 1 | 0.4%
be_list36-0194 | 1 | 0.4%
be_list36-0195 | 1 | 0.4%
be_list36-0196 | 1 | 0.4%
be_list36-0197 | 1 | 0.4%
be_list36-0198 | 1 | 0.4%
Other values (230) | 230 | 95.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 392 | 11.7%
3 | 294 | 8.8%
6 | 284 | 8.5%
B | 240 | 7.1%
T | 240 | 7.1%
E | 240 | 7.1%
- | 240 | 7.1%
S | 240 | 7.1%
i | 240 | 7.1%
L | 240 | 7.1%
Other values (8) | 710 | 21.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1440 | 42.9%
Uppercase Letter | 1200 | 35.7%
Dash Punctuation | 240 | 7.1%
Lowercase Letter | 240 | 7.1%
Connector Punctuation | 240 | 7.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 392 | 27.2%
3 | 294 | 20.4%
6 | 284 | 19.7%
1 | 154 | 10.7%
2 | 95 | 6.6%
4 | 45 | 3.1%
5 | 44 | 3.1%
7 | 44 | 3.1%
8 | 44 | 3.1%
9 | 44 | 3.1%

Uppercase Letter
Value | Count | Frequency (%)
B | 240 | 20.0%
T | 240 | 20.0%
E | 240 | 20.0%
S | 240 | 20.0%
L | 240 | 20.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 240 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
i | 240 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 240 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1920 | 57.1%
Latin | 1440 | 42.9%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 392 | 20.4%
3 | 294 | 15.3%
6 | 284 | 14.8%
- | 240 | 12.5%
_ | 240 | 12.5%
1 | 154 | 8.0%
2 | 95 | 4.9%
4 | 45 | 2.3%
5 | 44 | 2.3%
7 | 44 | 2.3%
Other values (2) | 88 | 4.6%

Latin
Value | Count | Frequency (%)
B | 240 | 16.7%
T | 240 | 16.7%
E | 240 | 16.7%
S | 240 | 16.7%
i | 240 | 16.7%
L | 240 | 16.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3360 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 392 | 11.7%
3 | 294 | 8.8%
6 | 284 | 8.5%
B | 240 | 7.1%
T | 240 | 7.1%
E | 240 | 7.1%
- | 240 | 7.1%
S | 240 | 7.1%
i | 240 | 7.1%
L | 240 | 7.1%
Other values (8) | 710 | 21.1%

명칭
Text

Distinct: 224
Distinct (%): 93.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

Length

Max length: 13
Median length: 11
Mean length: 5.8375
Min length: 2

Characters and Unicode

Total characters: 1401
Distinct characters: 295
Distinct categories: 7
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
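
The character totals and the mean lengths reported for the three text variables are linked: with no missing values, the mean length should equal the total character count divided by the 240 rows. A short check, using only numbers copied from this report:

```python
n_rows = 240

# "Total characters" as reported for each text variable.
total_characters = {"메인 키": 3360, "명칭": 1401, "행정 동": 983}

# Mean length = total characters / number of rows (no missing values).
mean_length = {name: total / n_rows for name, total in total_characters.items()}

for name, value in mean_length.items():
    print(name, round(value, 7))
```

The results agree with the mean lengths listed for these variables (14, 5.8375 and 4.0958333).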

Unique

Unique: 216
Unique (%): 90.0%
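
Distinct and Unique are different counts. Judging from the numbers in this report (for example, the constant column has 1 distinct and 0 unique values), Distinct is the number of different values and Unique is the number of values that occur exactly once. The sketch below illustrates that reading; the helper name and the sample names are hypothetical.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Hypothetical names, for illustration only.
names = ["Gallery A", "Gallery B", "Gallery B", "Gallery C", "Gallery C", "Gallery D"]
distinct, unique = distinct_and_unique(names)
print(distinct, unique)
```

Applied to 명칭, this reading means that 224 - 216 = 8 names are repeated and together account for 240 - 216 = 24 rows.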

Sample

1st row: ?阿?廊
2nd row: ?丹?廊
3rd row: ?米埃?廊
4th row: ?安?廊
5th row: 摩登?廊
ValueCountFrequency (%)
9
 
3.7%
7
 
2.9%
代?廊 4
 
1.7%
文化中心 3
 
1.2%
放送 2
 
0.8%
廊秀 2
 
0.8%
林匹克 1
 
0.4%
1
 
0.4%
post 1
 
0.4%
田?色小 1
 
0.4%
Other values (211) 211
87.2%

Most occurring characters

ValueCountFrequency (%)
? 526
37.5%
61
 
4.4%
37
 
2.6%
36
 
2.6%
29
 
2.1%
28
 
2.0%
28
 
2.0%
26
 
1.9%
26
 
1.9%
15
 
1.1%
Other values (285) 589
42.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 767 | 54.7%
Other Punctuation | 527 | 37.6%
Uppercase Letter | 68 | 4.9%
Lowercase Letter | 31 | 2.2%
Decimal Number | 5 | 0.4%
Space Separator | 2 | 0.1%
Math Symbol | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
61
 
8.0%
37
 
4.8%
36
 
4.7%
29
 
3.8%
28
 
3.7%
28
 
3.7%
26
 
3.4%
26
 
3.4%
15
 
2.0%
10
 
1.3%
Other values (242) 471
61.4%

Uppercase Letter
Value | Count | Frequency (%)
S | 10 | 14.7%
B | 7 | 10.3%
M | 6 | 8.8%
K | 5 | 7.4%
T | 5 | 7.4%
A | 4 | 5.9%
C | 4 | 5.9%
L | 4 | 5.9%
P | 3 | 4.4%
N | 3 | 4.4%
Other values (12) | 17 | 25.0%

Lowercase Letter
Value | Count | Frequency (%)
o | 5 | 16.1%
e | 4 | 12.9%
m | 3 | 9.7%
i | 3 | 9.7%
s | 2 | 6.5%
l | 2 | 6.5%
a | 2 | 6.5%
t | 2 | 6.5%
u | 2 | 6.5%
n | 2 | 6.5%
Other values (4) | 4 | 12.9%

Decimal Number
Value | Count | Frequency (%)
5 | 2 | 40.0%
1 | 2 | 40.0%
3 | 1 | 20.0%

Other Punctuation
Value | Count | Frequency (%)
? | 526 | 99.8%
& | 1 | 0.2%

Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Han | 764 | 54.5%
Common | 535 | 38.2%
Latin | 99 | 7.1%
Hangul | 3 | 0.2%

Most frequent character per script

Han
ValueCountFrequency (%)
61
 
8.0%
37
 
4.8%
36
 
4.7%
29
 
3.8%
28
 
3.7%
28
 
3.7%
26
 
3.4%
26
 
3.4%
15
 
2.0%
10
 
1.3%
Other values (240) 468
61.3%
Latin
ValueCountFrequency (%)
S 10
 
10.1%
B 7
 
7.1%
M 6
 
6.1%
K 5
 
5.1%
T 5
 
5.1%
o 5
 
5.1%
A 4
 
4.0%
e 4
 
4.0%
C 4
 
4.0%
L 4
 
4.0%
Other values (26) 45
45.5%
Common
ValueCountFrequency (%)
? 526
98.3%
2
 
0.4%
5 2
 
0.4%
1 2
 
0.4%
3 1
 
0.2%
+ 1
 
0.2%
& 1
 
0.2%
Hangul
ValueCountFrequency (%)
2
66.7%
1
33.3%

Most occurring blocks

Value | Count | Frequency (%)
CJK | 763 | 54.5%
ASCII | 634 | 45.3%
Hangul | 3 | 0.2%
CJK Compat Ideographs | 1 | 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
? 526
83.0%
S 10
 
1.6%
B 7
 
1.1%
M 6
 
0.9%
K 5
 
0.8%
T 5
 
0.8%
o 5
 
0.8%
A 4
 
0.6%
e 4
 
0.6%
C 4
 
0.6%
Other values (33) 58
 
9.1%
CJK
ValueCountFrequency (%)
61
 
8.0%
37
 
4.8%
36
 
4.7%
29
 
3.8%
28
 
3.7%
28
 
3.7%
26
 
3.4%
26
 
3.4%
15
 
2.0%
10
 
1.3%
Other values (239) 467
61.2%
Hangul
ValueCountFrequency (%)
2
66.7%
1
33.3%
CJK Compat Ideographs
ValueCountFrequency (%)
1
100.0%

행정 시
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

Value | Count
首?特?市 | 240

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 首?特?市
2nd row: 首?特?市
3rd row: 首?特?市
4th row: 首?特?市
5th row: 首?特?市

Common Values

Value | Count | Frequency (%)
首?特?市 | 240 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
首?特?市 | 240 | 100.0%

행정 구
Categorical

Distinct: 23
Distinct (%): 9.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

Value | Count
?路? | 110
江南? | 31
麻浦? | 13
中? | 12
?山? | 10
Other values (18) | 64

Length

Max length: 4
Median length: 3
Mean length: 2.9958333
Min length: 2

Unique

Unique: 4
Unique (%): 1.7%

Sample

1st row: ?路?
2nd row: 中?
3rd row: ?路?
4th row: ?路?
5th row: ?路?

Common Values

Value | Count | Frequency (%)
?路? | 110 | 45.8%
江南? | 31 | 12.9%
麻浦? | 13 | 5.4%
中? | 12 | 5.0%
?山? | 10 | 4.2%
瑞草? | 9 | 3.8%
松坡? | 8 | 3.3%
城北? | 6 | 2.5%
永登浦? | 6 | 2.5%
?川? | 5 | 2.1%
Other values (13) | 30 | 12.5%

Length

Histogram of lengths of the category

행정 동
Text

Distinct: 90
Distinct (%): 37.5%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

Length

Max length: 11
Median length: 3
Mean length: 4.0958333
Min length: 2

Characters and Unicode

Total characters: 983
Distinct characters: 107
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 63
Unique (%): 26.2%

Sample

1st row: 嘉?洞
2nd row: 小公洞
3rd row: 社稷洞
4th row: ?云孝子洞
5th row: ?路1.2.3.4街洞
Value | Count | Frequency (%)
路1.2.3.4街洞 | 23 | 9.6%
三?洞 | 20 | 8.3%
平?洞 | 18 | 7.5%
嘉?洞 | 10 | 4.2%
梨花洞 | 10 | 4.2%
社稷洞 | 9 | 3.8%
云孝子洞 | 8 | 3.3%
淸潭洞 | 8 | 3.3%
化洞 | 7 | 2.9%
西?洞 | 7 | 2.9%
Other values (79) | 120 | 50.0%

Most occurring characters

ValueCountFrequency (%)
240
24.4%
? 138
14.0%
. 71
 
7.2%
2 49
 
5.0%
1 40
 
4.1%
3 28
 
2.8%
27
 
2.7%
4 26
 
2.6%
26
 
2.6%
25
 
2.5%
Other values (97) 313
31.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 627 | 63.8%
Other Punctuation | 209 | 21.3%
Decimal Number | 147 | 15.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
240
38.3%
27
 
4.3%
26
 
4.1%
25
 
4.0%
18
 
2.9%
13
 
2.1%
10
 
1.6%
10
 
1.6%
10
 
1.6%
西 9
 
1.4%
Other values (89) 239
38.1%

Decimal Number
Value | Count | Frequency (%)
2 | 49 | 33.3%
1 | 40 | 27.2%
3 | 28 | 19.0%
4 | 26 | 17.7%
5 | 2 | 1.4%
6 | 2 | 1.4%

Other Punctuation
Value | Count | Frequency (%)
? | 138 | 66.0%
. | 71 | 34.0%

Most occurring scripts

Value | Count | Frequency (%)
Han | 625 | 63.6%
Common | 356 | 36.2%
Hangul | 2 | 0.2%

Most frequent character per script

Han
ValueCountFrequency (%)
240
38.4%
27
 
4.3%
26
 
4.2%
25
 
4.0%
18
 
2.9%
13
 
2.1%
10
 
1.6%
10
 
1.6%
10
 
1.6%
西 9
 
1.4%
Other values (88) 237
37.9%
Common
ValueCountFrequency (%)
? 138
38.8%
. 71
19.9%
2 49
 
13.8%
1 40
 
11.2%
3 28
 
7.9%
4 26
 
7.3%
5 2
 
0.6%
6 2
 
0.6%
Hangul
ValueCountFrequency (%)
2
100.0%

Most occurring blocks

Value | Count | Frequency (%)
CJK | 625 | 63.6%
ASCII | 356 | 36.2%
Hangul | 2 | 0.2%

Most frequent character per block

CJK
ValueCountFrequency (%)
240
38.4%
27
 
4.3%
26
 
4.2%
25
 
4.0%
18
 
2.9%
13
 
2.1%
10
 
1.6%
10
 
1.6%
10
 
1.6%
西 9
 
1.4%
Other values (88) 237
37.9%
ASCII
ValueCountFrequency (%)
? 138
38.8%
. 71
19.9%
2 49
 
13.8%
1 40
 
11.2%
3 28
 
7.9%
4 26
 
7.3%
5 2
 
0.6%
6 2
 
0.6%
Hangul
ValueCountFrequency (%)
2
100.0%

Correlations

Variable | 행정 구 | 행정 동
행정 구 | 1.000 | 1.000
행정 동 | 1.000 | 1.000
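
The report does not say which association measure fills this matrix. Assuming it is a Cramér's V style statistic, which is a common choice for pairs of categorical variables, a value of 1.000 is what one would expect when each neighborhood (행정 동) belongs to exactly one district (행정 구). The sketch below implements the plain, uncorrected Cramér's V with the standard library; the profiling tool's own computation may differ (for example by applying a bias correction), and the sample pairs are hypothetical.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Plain (uncorrected) Cramér's V between two categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for x, rx in row_totals.items():
        for y, cy in col_totals.items():
            expected = rx * cy / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Hypothetical data: each neighborhood lies in exactly one district.
district = ["A", "A", "B", "B"]
neighborhood = ["a1", "a2", "b1", "b1"]
nested_v = cramers_v(district, neighborhood)

# Hypothetical data with no association at all.
independent_v = cramers_v(["A", "A", "B", "B"], ["x", "y", "x", "y"])

print(nested_v, independent_v)
```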

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
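
The two displays summarize the same information: which cells hold a value. The following sketch shows the underlying counts on a few hypothetical rows (the column names and values are made up for illustration; the profiled dataset itself has no missing cells).

```python
# Hypothetical rows; None marks a missing cell.
rows = [
    {"key": "K-001", "name": "Gallery A", "district": "D1"},
    {"key": "K-002", "name": None, "district": "D1"},
    {"key": "K-003", "name": "Gallery C", "district": None},
]
columns = ["key", "name", "district"]

# Non-missing values per column: the quantity a per-column nullity chart plots.
present = {col: sum(row[col] is not None for row in rows) for col in columns}

# Text version of a nullity matrix: one line per row,
# '#' for a present cell and '.' for a missing one.
matrix = ["".join("#" if row[col] is not None else "." for col in columns) for row in rows]

print(present)
print("\n".join(matrix))
```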

Sample

First rows

Index | 메인 키 | 명칭 | 행정 시 | 행정 구 | 행정 동
0 | BE_LiST36-0040 | ?阿?廊 | 首?特?市 | ?路? | 嘉?洞
1 | BE_LiST36-0041 | ?丹?廊 | 首?特?市 | 中? | 小公洞
2 | BE_LiST36-0042 | ?米埃?廊 | 首?特?市 | ?路? | 社稷洞
3 | BE_LiST36-0043 | ?安?廊 | 首?特?市 | ?路? | ?云孝子洞
4 | BE_LiST36-0044 | 摩登?廊 | 首?特?市 | ?路? | ?路1.2.3.4街洞
5 | BE_LiST36-0045 | 摹茵?廊 | 首?特?市 | ?路? | ?路1.2.3.4街洞
6 | BE_LiST36-0046 | ?幻 | 首?特?市 | 西大?? | 新村洞
7 | BE_LiST36-0047 | 文化??社 | 首?特?市 | ?路? | ?路1.2.3.4街洞
8 | BE_LiST36-0048 | 美林?房 | 首?特?市 | ?山? | ?江路洞
9 | BE_LiST36-0049 | 白慈恩?廊 | 首?特?市 | ?路? | 平?洞

Last rows

Index | 메인 키 | 명칭 | 행정 시 | 행정 구 | 행정 동
230 | BE_LiST36-0030 | 金珍藏品 | 首?特?市 | ?路? | 平?洞
231 | BE_LiST36-0031 | 金?朱?廊 | 首?特?市 | ?路? | 三?洞
232 | BE_LiST36-0032 | 蘆花? | 首?特?市 | ?路? | ?路1.2.3.4街洞
233 | BE_LiST36-0033 | 多??廊 | 首?特?市 | 江南? | 新沙洞
234 | BE_LiST36-0034 | ?石?廊 | 首?特?市 | 城北? | 城北洞
235 | BE_LiST36-0035 | 大林美?? | 首?特?市 | ?路? | 社稷洞
236 | BE_LiST36-0036 | 大林?廊 | 首?特?市 | ?路? | ?路1.2.3.4街洞
237 | BE_LiST36-0037 | 代案空?普? | 首?特?市 | ?路? | 平?洞
238 | BE_LiST36-0038 | 德元?廊 | 首?特?市 | ?路? | ?路1.2.3.4街洞
239 | BE_LiST36-0039 | ?山房?廊 | 首?特?市 | ?路? | ?路1.2.3.4街洞