Overview

Dataset statistics

Number of variables: 5
Number of observations: 250
Missing cells: 10
Missing cells (%): 0.8%
Duplicate rows: 1
Duplicate rows (%): 0.4%
Total size in memory: 9.9 KiB
Average record size in memory: 40.5 B
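
The two percentages above follow directly from the reported counts. A minimal sketch of that arithmetic, using only the figures in this section (the variable names are illustrative and not part of the report):

```python
# Figures copied from the dataset statistics in this report.
n_variables = 5
n_observations = 250
missing_cells = 10
duplicate_rows = 1

# Missing cells are measured against every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Duplicate rows are measured against the number of observations.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```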

Variable types

Unsupported: 1
Text: 4

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-2791/S/1/datasetView.do

Alerts

Dataset has 1 (0.4%) duplicate row (Duplicates)
서울특별시 지식산업센터 현황(1955~2021) 2021.5.31. 기준 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
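
Both alerts are consistent with the Sample section of this report, where the first two rows are empty and the third row holds the real column names. A minimal, library-free sketch of one possible cleanup follows. It assumes the rows are available as plain Python lists; the helper names `is_empty` and `promote_header` are hypothetical and not part of ydata-profiling.

```python
def is_empty(row):
    """Return True when every cell in the row is None or blank."""
    return all(cell is None or str(cell).strip() == "" for cell in row)


def promote_header(rows):
    """Skip leading empty rows and use the first non-empty row as the header.

    Returns (header, records), where records is a list of dicts keyed by header.
    """
    for i, row in enumerate(rows):
        if not is_empty(row):
            header = [str(cell).strip() for cell in row]
            records = [dict(zip(header, r)) for r in rows[i + 1:] if not is_empty(r)]
            return header, records
    return [], []


# Rows copied from the Sample section of this report (None stands for a missing cell).
rows = [
    [None, None, None, None, None],
    [None, None, None, None, None],
    ["순번", "지식산업센터명", "회사명", "설립승인일", "공장대표주소"],
    [1, "더리즌밸리", "(주)하나자산신탁", "20180528", "서울특별시 금천구 가산로9길 66 (가산동)"],
]

header, records = promote_header(rows)
print(header)
print(records[0]["회사명"])
```

Profiling the cleaned records instead of the raw rows would likely remove both alerts, although that has not been verified against the source file.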

Reproduction

Analysis started: 2023-12-11 06:17:04.771142
Analysis finished: 2023-12-11 06:17:05.613744
Duration: 0.84 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

서울특별시 지식산업센터 현황(1955~2021) 2021.5.31. 기준 (Unsupported)

Missing: 2
Missing (%): 0.8%
Memory size: 2.1 KiB

Unnamed: 1 (Text)

Distinct: 245
Distinct (%): 98.8%
Missing: 2
Missing (%): 0.8%
Memory size: 2.1 KiB

Length

Max length: 29
Median length: 15
Mean length: 8.5725806
Min length: 2

Characters and Unicode

Total characters: 2126
Distinct characters: 252
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
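
As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own implementation. The two-letter codes it returns correspond to the long names used in this report (for example `Lo` is Other Letter and `Lu` is Uppercase Letter). The standard library does not expose script or block names, so those are left out. The helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in the given strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample values listed for this variable in the report.
sample = ["지식산업센터명", "더리즌밸리", "LG아파트형공장", "건우정공", "에스티엑스브이타워"]
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```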

Unique

Unique: 242
Unique (%): 97.6%

Sample

1st row: 지식산업센터명
2nd row: 더리즌밸리
3rd row: LG아파트형공장
4th row: 건우정공
5th row: 에스티엑스브이타워

Value | Count | Frequency (%)
서울숲 | 7 | 2.2%
에이스테크노타워 | 7 | 2.2%
sk | 5 | 1.5%
문정 | 4 | 1.2%
아이에스비즈타워 | 3 | 0.9%
선유도 | 3 | 0.9%
1차 | 3 | 0.9%
한화비즈메트로 | 3 | 0.9%
v1 | 3 | 0.9%
대륭테크노타운 | 3 | 0.9%
Other values (269) | 282 | 87.3%

Most occurring characters

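Value | Count | Frequency (%)
(not preserved) | 116 | 5.5%
(not preserved) | 92 | 4.3%
(not preserved) | 87 | 4.1%
(not preserved) | 87 | 4.1%
(space) | 75 | 3.5%
(not preserved) | 67 | 3.2%
(not preserved) | 49 | 2.3%
(not preserved) | 45 | 2.1%
(not preserved) | 45 | 2.1%
(not preserved) | 42 | 2.0%
Other values (242) | 1421 | 66.8%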

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1801 | 84.7%
Uppercase Letter | 103 | 4.8%
Decimal Number | 86 | 4.0%
Space Separator | 75 | 3.5%
Lowercase Letter | 21 | 1.0%
Close Punctuation | 13 | 0.6%
Open Punctuation | 13 | 0.6%
Dash Punctuation | 7 | 0.3%
Letter Number | 4 | 0.2%
Other Punctuation | 3 | 0.1%

Most frequent character per category

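Other Letter
Value | Count | Frequency (%)
(not preserved) | 116 | 6.4%
(not preserved) | 92 | 5.1%
(not preserved) | 87 | 4.8%
(not preserved) | 87 | 4.8%
(not preserved) | 67 | 3.7%
(not preserved) | 49 | 2.7%
(not preserved) | 45 | 2.5%
(not preserved) | 45 | 2.5%
(not preserved) | 42 | 2.3%
(not preserved) | 36 | 2.0%
Other values (191) | 1135 | 63.0%

Uppercase Letter
Value | Count | Frequency (%)
I | 21 | 20.4%
T | 17 | 16.5%
K | 11 | 10.7%
S | 7 | 6.8%
C | 5 | 4.9%
N | 4 | 3.9%
A | 4 | 3.9%
B | 4 | 3.9%
H | 4 | 3.9%
V | 3 | 2.9%
Other values (14) | 23 | 22.3%

Lowercase Letter
Value | Count | Frequency (%)
e | 7 | 33.3%
n | 3 | 14.3%
r | 3 | 14.3%
t | 2 | 9.5%
o | 1 | 4.8%
w | 1 | 4.8%
v | 1 | 4.8%
k | 1 | 4.8%
s | 1 | 4.8%
c | 1 | 4.8%

Decimal Number
Value | Count | Frequency (%)
1 | 26 | 30.2%
2 | 24 | 27.9%
3 | 8 | 9.3%
5 | 6 | 7.0%
7 | 6 | 7.0%
8 | 5 | 5.8%
6 | 5 | 5.8%
0 | 3 | 3.5%
9 | 3 | 3.5%

Other Punctuation
Value | Count | Frequency (%)
: | 2 | 66.7%
/ | 1 | 33.3%

Letter Number
Value | Count | Frequency (%)
(not preserved) | 2 | 50.0%
(not preserved) | 2 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 75 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 13 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 13 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 7 | 100.0%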

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1801 | 84.7%
Common | 197 | 9.3%
Latin | 128 | 6.0%

Most frequent character per script

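Hangul
Value | Count | Frequency (%)
(not preserved) | 116 | 6.4%
(not preserved) | 92 | 5.1%
(not preserved) | 87 | 4.8%
(not preserved) | 87 | 4.8%
(not preserved) | 67 | 3.7%
(not preserved) | 49 | 2.7%
(not preserved) | 45 | 2.5%
(not preserved) | 45 | 2.5%
(not preserved) | 42 | 2.3%
(not preserved) | 36 | 2.0%
Other values (191) | 1135 | 63.0%

Latin
Value | Count | Frequency (%)
I | 21 | 16.4%
T | 17 | 13.3%
K | 11 | 8.6%
e | 7 | 5.5%
S | 7 | 5.5%
C | 5 | 3.9%
N | 4 | 3.1%
A | 4 | 3.1%
B | 4 | 3.1%
H | 4 | 3.1%
Other values (26) | 44 | 34.4%

Common
Value | Count | Frequency (%)
(space) | 75 | 38.1%
1 | 26 | 13.2%
2 | 24 | 12.2%
) | 13 | 6.6%
( | 13 | 6.6%
3 | 8 | 4.1%
- | 7 | 3.6%
5 | 6 | 3.0%
7 | 6 | 3.0%
8 | 5 | 2.5%
Other values (5) | 14 | 7.1%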

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1801 | 84.7%
ASCII | 321 | 15.1%
Number Forms | 4 | 0.2%

Most frequent character per block

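Hangul
Value | Count | Frequency (%)
(not preserved) | 116 | 6.4%
(not preserved) | 92 | 5.1%
(not preserved) | 87 | 4.8%
(not preserved) | 87 | 4.8%
(not preserved) | 67 | 3.7%
(not preserved) | 49 | 2.7%
(not preserved) | 45 | 2.5%
(not preserved) | 45 | 2.5%
(not preserved) | 42 | 2.3%
(not preserved) | 36 | 2.0%
Other values (191) | 1135 | 63.0%

ASCII
Value | Count | Frequency (%)
(space) | 75 | 23.4%
1 | 26 | 8.1%
2 | 24 | 7.5%
I | 21 | 6.5%
T | 17 | 5.3%
) | 13 | 4.0%
( | 13 | 4.0%
K | 11 | 3.4%
3 | 8 | 2.5%
e | 7 | 2.2%
Other values (39) | 106 | 33.0%

Number Forms
Value | Count | Frequency (%)
(not preserved) | 2 | 50.0%
(not preserved) | 2 | 50.0%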

Unnamed: 2 (Text)

Distinct: 195
Distinct (%): 78.6%
Missing: 2
Missing (%): 0.8%
Memory size: 2.1 KiB

Length

Max length: 24
Median length: 20
Mean length: 9.4677419
Min length: 3

Characters and Unicode

Total characters: 2348
Distinct characters: 252
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 169
Unique (%): 68.1%

Sample

1st row: 회사명
2nd row: (주)하나자산신탁
3rd row: (주)LG
4th row: (주)건우정공
5th row: (주)골드랜드리치

Value | Count | Frequency (%)
주식회사 | 10 | 3.5%
대륭종합건설(주 | 8 | 2.8%
주)하나자산신탁 | 6 | 2.1%
한국자산신탁(주 | 5 | 1.7%
에이스종합건설(주 | 5 | 1.7%
운영위원회 | 5 | 1.7%
중소기업진흥공단 | 4 | 1.4%
대한토지신탁(주 | 4 | 1.4%
케이비부동산신탁(주 | 4 | 1.4%
주)우림휴먼텍 | 4 | 1.4%
Other values (205) | 232 | 80.8%

Most occurring characters

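Value | Count | Frequency (%)
(not preserved) | 189 | 8.0%
( | 162 | 6.9%
) | 162 | 6.9%
(not preserved) | 75 | 3.2%
(not preserved) | 49 | 2.1%
(not preserved) | 49 | 2.1%
(not preserved) | 46 | 2.0%
(not preserved) | 45 | 1.9%
(not preserved) | 44 | 1.9%
(not preserved) | 43 | 1.8%
Other values (242) | 1484 | 63.2%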

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1933 | 82.3%
Open Punctuation | 162 | 6.9%
Close Punctuation | 162 | 6.9%
Space Separator | 40 | 1.7%
Uppercase Letter | 28 | 1.2%
Decimal Number | 20 | 0.9%
Dash Punctuation | 2 | 0.1%
Lowercase Letter | 1 | < 0.1%

Most frequent character per category

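Other Letter
Value | Count | Frequency (%)
(not preserved) | 189 | 9.8%
(not preserved) | 75 | 3.9%
(not preserved) | 49 | 2.5%
(not preserved) | 49 | 2.5%
(not preserved) | 46 | 2.4%
(not preserved) | 45 | 2.3%
(not preserved) | 44 | 2.3%
(not preserved) | 43 | 2.2%
(not preserved) | 36 | 1.9%
(not preserved) | 36 | 1.9%
Other values (218) | 1321 | 68.3%

Uppercase Letter
Value | Count | Frequency (%)
K | 5 | 17.9%
S | 5 | 17.9%
T | 3 | 10.7%
I | 3 | 10.7%
B | 2 | 7.1%
C | 2 | 7.1%
E | 2 | 7.1%
O | 1 | 3.6%
G | 1 | 3.6%
A | 1 | 3.6%
Other values (3) | 3 | 10.7%

Decimal Number
Value | Count | Frequency (%)
1 | 6 | 30.0%
8 | 4 | 20.0%
2 | 4 | 20.0%
5 | 3 | 15.0%
6 | 2 | 10.0%
7 | 1 | 5.0%

Open Punctuation
Value | Count | Frequency (%)
( | 162 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 162 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 40 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
e | 1 | 100.0%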

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1933 | 82.3%
Common | 386 | 16.4%
Latin | 29 | 1.2%

Most frequent character per script

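Hangul
Value | Count | Frequency (%)
(not preserved) | 189 | 9.8%
(not preserved) | 75 | 3.9%
(not preserved) | 49 | 2.5%
(not preserved) | 49 | 2.5%
(not preserved) | 46 | 2.4%
(not preserved) | 45 | 2.3%
(not preserved) | 44 | 2.3%
(not preserved) | 43 | 2.2%
(not preserved) | 36 | 1.9%
(not preserved) | 36 | 1.9%
Other values (218) | 1321 | 68.3%

Latin
Value | Count | Frequency (%)
K | 5 | 17.2%
S | 5 | 17.2%
T | 3 | 10.3%
I | 3 | 10.3%
B | 2 | 6.9%
C | 2 | 6.9%
E | 2 | 6.9%
O | 1 | 3.4%
G | 1 | 3.4%
A | 1 | 3.4%
Other values (4) | 4 | 13.8%

Common
Value | Count | Frequency (%)
( | 162 | 42.0%
) | 162 | 42.0%
(space) | 40 | 10.4%
1 | 6 | 1.6%
8 | 4 | 1.0%
2 | 4 | 1.0%
5 | 3 | 0.8%
6 | 2 | 0.5%
- | 2 | 0.5%
7 | 1 | 0.3%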

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1933 | 82.3%
ASCII | 415 | 17.7%

Most frequent character per block

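Hangul
Value | Count | Frequency (%)
(not preserved) | 189 | 9.8%
(not preserved) | 75 | 3.9%
(not preserved) | 49 | 2.5%
(not preserved) | 49 | 2.5%
(not preserved) | 46 | 2.4%
(not preserved) | 45 | 2.3%
(not preserved) | 44 | 2.3%
(not preserved) | 43 | 2.2%
(not preserved) | 36 | 1.9%
(not preserved) | 36 | 1.9%
Other values (218) | 1321 | 68.3%

ASCII
Value | Count | Frequency (%)
( | 162 | 39.0%
) | 162 | 39.0%
(space) | 40 | 9.6%
1 | 6 | 1.4%
K | 5 | 1.2%
S | 5 | 1.2%
8 | 4 | 1.0%
2 | 4 | 1.0%
T | 3 | 0.7%
I | 3 | 0.7%
Other values (14) | 21 | 5.1%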

Unnamed: 3 (Text)

Distinct: 229
Distinct (%): 92.3%
Missing: 2
Missing (%): 0.8%
Memory size: 2.1 KiB

Length

Max length: 8
Median length: 8
Mean length: 7.9879032
Min length: 5

Characters and Unicode

Total characters: 1981
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 214
Unique (%): 86.3%

Sample

1st row: 설립승인일
2nd row: 20180528
3rd row: 20090424
4th row: 20080627
5th row: 20100929

Value | Count | Frequency (%)
20010201 | 4 | 1.6%
20030520 | 3 | 1.2%
20100712 | 3 | 1.2%
19990807 | 2 | 0.8%
20051006 | 2 | 0.8%
20040713 | 2 | 0.8%
20160224 | 2 | 0.8%
20120926 | 2 | 0.8%
20020730 | 2 | 0.8%
20151211 | 2 | 0.8%
Other values (219) | 224 | 90.3%
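
The values in this variable look like dates written as eight-digit YYYYMMDD strings, with the column label 설립승인일 mixed in as an ordinary value. A minimal sketch of strict parsing under that assumption follows. The helper name `parse_yyyymmdd` is hypothetical, and the report itself does not confirm that every value is a valid calendar date.

```python
from datetime import date, datetime


def parse_yyyymmdd(text):
    """Return a date for an 8-digit YYYYMMDD string, or None if it does not parse."""
    # Check the shape first: strptime alone would also accept shorter digit runs.
    if not isinstance(text, str) or len(text) != 8 or not text.isdigit():
        return None
    try:
        return datetime.strptime(text, "%Y%m%d").date()
    except ValueError:
        # Eight digits that do not form a real calendar date, e.g. 20180230.
        return None


# Values copied from the Sample list for this variable.
for raw in ["설립승인일", "20180528", "20090424"]:
    print(raw, "->", parse_yyyymmdd(raw))
```

Returning None instead of raising keeps stray header rows and malformed entries visible as missing values, so they can be counted before being dropped.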

Most occurring characters

Value | Count | Frequency (%)
0 | 698 | 35.2%
2 | 419 | 21.2%
1 | 330 | 16.7%
9 | 109 | 5.5%
7 | 80 | 4.0%
8 | 75 | 3.8%
3 | 70 | 3.5%
5 | 69 | 3.5%
6 | 64 | 3.2%
4 | 62 | 3.1%
Other values (5) | 5 | 0.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1976 | 99.7%
Other Letter | 5 | 0.3%

Most frequent character per category

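Decimal Number
Value | Count | Frequency (%)
0 | 698 | 35.3%
2 | 419 | 21.2%
1 | 330 | 16.7%
9 | 109 | 5.5%
7 | 80 | 4.0%
8 | 75 | 3.8%
3 | 70 | 3.5%
5 | 69 | 3.5%
6 | 64 | 3.2%
4 | 62 | 3.1%

Other Letter
Value | Count | Frequency (%)
(not preserved) | 1 | 20.0%
(not preserved) | 1 | 20.0%
(not preserved) | 1 | 20.0%
(not preserved) | 1 | 20.0%
(not preserved) | 1 | 20.0%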

Most occurring scripts

Value | Count | Frequency (%)
Common | 1976 | 99.7%
Hangul | 5 | 0.3%

Most frequent character per script

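Common
Value | Count | Frequency (%)
0 | 698 | 35.3%
2 | 419 | 21.2%
1 | 330 | 16.7%
9 | 109 | 5.5%
7 | 80 | 4.0%
8 | 75 | 3.8%
3 | 70 | 3.5%
5 | 69 | 3.5%
6 | 64 | 3.2%
4 | 62 | 3.1%

Hangul
Value | Count | Frequency (%)
(not preserved) | 1 | 20.0%
(not preserved) | 1 | 20.0%
(not preserved) | 1 | 20.0%
(not preserved) | 1 | 20.0%
(not preserved) | 1 | 20.0%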

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1976 | 99.7%
Hangul | 5 | 0.3%

Most frequent character per block

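ASCII
Value | Count | Frequency (%)
0 | 698 | 35.3%
2 | 419 | 21.2%
1 | 330 | 16.7%
9 | 109 | 5.5%
7 | 80 | 4.0%
8 | 75 | 3.8%
3 | 70 | 3.5%
5 | 69 | 3.5%
6 | 64 | 3.2%
4 | 62 | 3.1%

Hangul
Value | Count | Frequency (%)
(not preserved) | 1 | 20.0%
(not preserved) | 1 | 20.0%
(not preserved) | 1 | 20.0%
(not preserved) | 1 | 20.0%
(not preserved) | 1 | 20.0%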

Unnamed: 4 (Text)

Distinct: 246
Distinct (%): 99.2%
Missing: 2
Missing (%): 0.8%
Memory size: 2.1 KiB

Length

Max length: 71
Median length: 45
Mean length: 34.548387
Min length: 6

Characters and Unicode

Total characters: 8568
Distinct characters: 255
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 244
Unique (%): 98.4%

Sample

1st row: 공장대표주소
2nd row: 서울특별시 금천구 가산로9길 66 (가산동)
3rd row: 서울특별시 금천구 가산디지털1로 189 (가산동, (주)LG 가산 디지털센터)
4th row: 서울특별시 금천구 디지털로 177 (가산동, (주)건우정공)
5th row: 서울특별시 금천구 가산디지털1로 128 (가산동, STX V-TOWER)

Value | Count | Frequency (%)
서울특별시 | 247 | 15.8%
금천구 | 92 | 5.9%
가산동 | 85 | 5.4%
성동구 | 50 | 3.2%
구로구 | 42 | 2.7%
구로동 | 41 | 2.6%
성수동2가 | 40 | 2.6%
가산디지털1로 | 34 | 2.2%
영등포구 | 24 | 1.5%
(not preserved) | 21 | 1.3%
Other values (551) | 886 | 56.7%

Most occurring characters

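Value | Count | Frequency (%)
(space) | 1324 | 15.5%
(not preserved) | 347 | 4.0%
(not preserved) | 314 | 3.7%
(not preserved) | 298 | 3.5%
(not preserved) | 271 | 3.2%
) | 268 | 3.1%
( | 268 | 3.1%
(not preserved) | 261 | 3.0%
(not preserved) | 253 | 3.0%
(not preserved) | 253 | 3.0%
Other values (245) | 4711 | 55.0%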

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5266 | 61.5%
Space Separator | 1324 | 15.5%
Decimal Number | 1097 | 12.8%
Close Punctuation | 268 | 3.1%
Open Punctuation | 268 | 3.1%
Other Punctuation | 175 | 2.0%
Uppercase Letter | 96 | 1.1%
Dash Punctuation | 58 | 0.7%
Letter Number | 9 | 0.1%
Lowercase Letter | 7 | 0.1%

Most frequent character per category

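Other Letter
Value | Count | Frequency (%)
(not preserved) | 347 | 6.6%
(not preserved) | 314 | 6.0%
(not preserved) | 298 | 5.7%
(not preserved) | 271 | 5.1%
(not preserved) | 261 | 5.0%
(not preserved) | 253 | 4.8%
(not preserved) | 253 | 4.8%
(not preserved) | 252 | 4.8%
(not preserved) | 232 | 4.4%
(not preserved) | 199 | 3.8%
Other values (200) | 2586 | 49.1%

Uppercase Letter
Value | Count | Frequency (%)
T | 14 | 14.6%
I | 14 | 14.6%
L | 12 | 12.5%
B | 11 | 11.5%
K | 6 | 6.2%
E | 5 | 5.2%
S | 5 | 5.2%
O | 4 | 4.2%
R | 4 | 4.2%
W | 4 | 4.2%
Other values (11) | 17 | 17.7%

Decimal Number
Value | Count | Frequency (%)
1 | 242 | 22.1%
2 | 209 | 19.1%
3 | 137 | 12.5%
6 | 100 | 9.1%
5 | 90 | 8.2%
4 | 88 | 8.0%
7 | 73 | 6.7%
8 | 59 | 5.4%
9 | 53 | 4.8%
0 | 46 | 4.2%

Lowercase Letter
Value | Count | Frequency (%)
e | 2 | 28.6%
n | 2 | 28.6%
c | 1 | 14.3%
t | 1 | 14.3%
r | 1 | 14.3%

Letter Number
Value | Count | Frequency (%)
(not preserved) | 4 | 44.4%
(not preserved) | 4 | 44.4%
(not preserved) | 1 | 11.1%

Other Punctuation
Value | Count | Frequency (%)
, | 174 | 99.4%
/ | 1 | 0.6%

Space Separator
Value | Count | Frequency (%)
(space) | 1324 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 268 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 268 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 58 | 100.0%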

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 5266 | 61.5%
Common | 3190 | 37.2%
Latin | 112 | 1.3%

Most frequent character per script

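Hangul
Value | Count | Frequency (%)
(not preserved) | 347 | 6.6%
(not preserved) | 314 | 6.0%
(not preserved) | 298 | 5.7%
(not preserved) | 271 | 5.1%
(not preserved) | 261 | 5.0%
(not preserved) | 253 | 4.8%
(not preserved) | 253 | 4.8%
(not preserved) | 252 | 4.8%
(not preserved) | 232 | 4.4%
(not preserved) | 199 | 3.8%
Other values (200) | 2586 | 49.1%

Latin
Value | Count | Frequency (%)
T | 14 | 12.5%
I | 14 | 12.5%
L | 12 | 10.7%
B | 11 | 9.8%
K | 6 | 5.4%
E | 5 | 4.5%
S | 5 | 4.5%
(not preserved) | 4 | 3.6%
(not preserved) | 4 | 3.6%
O | 4 | 3.6%
Other values (19) | 33 | 29.5%

Common
Value | Count | Frequency (%)
(space) | 1324 | 41.5%
) | 268 | 8.4%
( | 268 | 8.4%
1 | 242 | 7.6%
2 | 209 | 6.6%
, | 174 | 5.5%
3 | 137 | 4.3%
6 | 100 | 3.1%
5 | 90 | 2.8%
4 | 88 | 2.8%
Other values (6) | 290 | 9.1%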

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5266 | 61.5%
ASCII | 3293 | 38.4%
Number Forms | 9 | 0.1%

Most frequent character per block

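ASCII
Value | Count | Frequency (%)
(space) | 1324 | 40.2%
) | 268 | 8.1%
( | 268 | 8.1%
1 | 242 | 7.3%
2 | 209 | 6.3%
, | 174 | 5.3%
3 | 137 | 4.2%
6 | 100 | 3.0%
5 | 90 | 2.7%
4 | 88 | 2.7%
Other values (32) | 393 | 11.9%

Hangul
Value | Count | Frequency (%)
(not preserved) | 347 | 6.6%
(not preserved) | 314 | 6.0%
(not preserved) | 298 | 5.7%
(not preserved) | 271 | 5.1%
(not preserved) | 261 | 5.0%
(not preserved) | 253 | 4.8%
(not preserved) | 253 | 4.8%
(not preserved) | 252 | 4.8%
(not preserved) | 232 | 4.4%
(not preserved) | 199 | 3.8%
Other values (200) | 2586 | 49.1%

Number Forms
Value | Count | Frequency (%)
(not preserved) | 4 | 44.4%
(not preserved) | 4 | 44.4%
(not preserved) | 1 | 11.1%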

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
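
Nullity correlation is commonly computed as the Pearson correlation between the missing-value indicators of two columns. The sketch below assumes that definition and uses plain Python lists with None for missing cells. The function name `nullity_correlation` is hypothetical, and this is not the code that produced the heatmap in this report.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns.

    Returns None when either column has no variation in missingness,
    because the correlation is undefined in that case.
    """
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)


# Toy columns that are missing in the same two leading rows.
name_col = [None, None, "더리즌밸리", "LG아파트형공장"]
date_col = [None, None, "20180528", "20090424"]
r_same = nullity_correlation(name_col, date_col)

# A column that is missing exactly where the other one is present.
opposite_col = ["x", "y", None, None]
r_opposite = nullity_correlation(name_col, opposite_col)

print(r_same, r_opposite)
```

Columns that are always missing together score 1, and columns that are never missing together score -1. In the sample rows shown in this report the two leading rows are empty in every column, which is the pattern that produces a score of 1.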

Sample

First rows

Index | 서울특별시 지식산업센터 현황(1955~2021) 2021.5.31. 기준 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4
0 | NaN | <NA> | <NA> | <NA> | <NA>
1 | NaN | <NA> | <NA> | <NA> | <NA>
2 | 순번 | 지식산업센터명 | 회사명 | 설립승인일 | 공장대표주소
3 | 1 | 더리즌밸리 | (주)하나자산신탁 | 20180528 | 서울특별시 금천구 가산로9길 66 (가산동)
4 | 2 | LG아파트형공장 | (주)LG | 20090424 | 서울특별시 금천구 가산디지털1로 189 (가산동, (주)LG 가산 디지털센터)
5 | 3 | 건우정공 | (주)건우정공 | 20080627 | 서울특별시 금천구 디지털로 177 (가산동, (주)건우정공)
6 | 4 | 에스티엑스브이타워 | (주)골드랜드리치 | 20100929 | 서울특별시 금천구 가산디지털1로 128 (가산동, STX V-TOWER)
7 | 5 | 한라시그마밸리 | (주)국도테크노타운 | 20091020 | 서울특별시 금천구 가산디지털2로 53 (가산동, 한라시그마밸리)
8 | 6 | 남성프라자(에이스테크노9차) | (주)남성텔레콤 | 20050822 | 서울특별시 금천구 디지털로 130 (가산동, 남성프라자 (에이스9))
9 | 7 | (주)동광인터내셔날 | (주)동광인터내셔날 | 20051123 | 서울특별시 금천구 가산디지털2로 144 (가산동, 창고)

Last rows

Index | 서울특별시 지식산업센터 현황(1955~2021) 2021.5.31. 기준 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4
240 | 238 | 하우스디비즈 | 한국자산신탁(주) | 20100809 | 서울특별시 영등포구 선유로3길 10 (문래동5가) 하우스디 비즈
241 | 239 | 송파 테라타워2 | 한국자산신탁(주) | 20141030 | 서울특별시 송파구 문정동 642번지 (문정지구 특별계획구역 1-1BL)
242 | 240 | 에이스성수타워1 | 한국자산신탁(주) | 20131001 | 서울특별시 성동구 광나루로8길 6 (성수동2가)
243 | 241 | 엠스테이트 | 한국자산신탁(주) | 20160830 | 서울특별시 송파구 문정동 643-1번지 (문정지구 특별계획구역2)
244 | 242 | 에이스하이엔드타워10차 | 한국자산신탁주식회사 | 20140227 | 서울특별시 금천구 가산디지털1로 30 (가산동, 4동)
245 | 243 | 한국전자협동 | 한국전자협동운영위원회 | 20070813 | 서울특별시 금천구 가산디지털2로 114 (가산동, 한국전자협동아파트형공장)
246 | 244 | 에이스한솔타워 | 한솔프린팅(주) | 20191230 | 서울특별시 금천구 가산디지털1로 58, 에이스한솔타워 (가산동)
247 | 245 | 한신IT타워 | 한신IT입주자협의 | 20010426 | 서울특별시 구로구 디지털로 272 (구로동, 한신아이티타워)
248 | 246 | 서울숲한라시그마밸리II | 향우실업(주) | 20130522 | 서울특별시 성동구 성수이로7길 7 (성수동2가)
249 | 247 | 호서대벤처타워 | 호서대벤처타워관리단 | 20080912 | 서울특별시 금천구 가산디지털1로 70 (가산동, 호서대벤처타워)

Duplicate rows

Most frequently occurring

Index | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | 2
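
The table reports one group of identical rows that occurs twice, which is consistent with the single duplicate row counted in the overview (the second copy is the duplicate). A minimal sketch of that kind of count, assuming rows are tuples and missing cells are represented by None, follows. The helper name `duplicate_groups` is hypothetical.

```python
from collections import Counter


def duplicate_groups(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# The two leading empty rows plus two regular rows from the Sample section,
# restricted to the four text columns used in the duplicates table.
rows = [
    (None, None, None, None),
    (None, None, None, None),
    ("지식산업센터명", "회사명", "설립승인일", "공장대표주소"),
    ("더리즌밸리", "(주)하나자산신탁", "20180528", "서울특별시 금천구 가산로9길 66 (가산동)"),
]

groups = duplicate_groups(rows)
extra_copies = sum(n - 1 for n in groups.values())
print(groups)
print(extra_copies)
```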