Overview

Dataset statistics

Number of variables: 5
Number of observations: 289
Missing cells: 3
Missing cells (%): 0.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.4 KiB
Average record size in memory: 40.5 B
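
The missing-cell percentage follows from the counts above. A minimal sketch of the arithmetic, assuming the report divides the number of missing cells by rows × columns (the function name is illustrative and not part of ydata-profiling):

```python
def missing_cell_percentage(n_missing: int, n_rows: int, n_columns: int) -> float:
    """Share of missing cells among all cells, expressed as a percentage."""
    total_cells = n_rows * n_columns
    return 100.0 * n_missing / total_cells


# Counts taken from the statistics above: 3 missing cells, 289 rows, 5 columns.
pct = missing_cell_percentage(3, 289, 5)
print(f"{pct:.1f}%")  # 3 of 1445 cells -> prints 0.2%
```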

Variable types

Unsupported: 2
Categorical: 1
Text: 2

Dataset

Description: File download
Author: 서울교통공사 (Seoul Metro)
URL: https://data.seoul.go.kr/dataList/OA-13235/F/1/datasetView.do

Alerts

Unnamed: 4 has unique values (Unique)
서울교통공사 산업재산권 등록현황 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
Unnamed: 3 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
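
The two Unsupported alerts, together with the rows in the Sample section, suggest that the source sheet starts with a title row and that the real column labels sit on the second row. A minimal sketch of promoting that row to a header, using plain Python lists; the rows are copied from the Sample section, `None` stands in for NaN/<NA>, and `promote_header` is a hypothetical helper rather than anything the report itself uses:

```python
from typing import Any


def promote_header(rows: list[list[Any]], header_index: int) -> list[dict[str, Any]]:
    """Drop everything above the header row and key the remaining rows by it."""
    header = [str(name) for name in rows[header_index]]
    return [dict(zip(header, row)) for row in rows[header_index + 1:]]


# First rows as shown in the Sample section of this report.
raw_rows = [
    [None, None, None, None, "(2020.03.16.현재)"],
    ["연번", "구분", "발명의 명칭", "등록번호", "등록일자"],
    [1, "특허", "레일의 체결구조", "2005-02-22 00:00:00", "10-0474255-00-00"],
    [2, "특허 (통상)", "콘크리트 표면처리용 조성물 및 표면처리제",
     "2006-03-24 00:00:00", "10-0566481-00-00"],
]

records = promote_header(raw_rows, header_index=1)
print(records[0]["구분"])  # prints 특허
```

In the sample rows, the column labelled 등록번호 (registration number) holds timestamps and the column labelled 등록일자 (registration date) holds the numbers, so the two labels appear to be swapped in the source and may need correcting after the header is promoted.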

Reproduction

Analysis started: 2024-04-29 16:47:35.402010
Analysis finished: 2024-04-29 16:47:36.034891
Duration: 0.63 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

서울교통공사 산업재산권 등록현황
Unsupported

REJECTED, UNSUPPORTED

Missing: 1
Missing (%): 0.3%
Memory size: 2.4 KiB

Unnamed: 1
Categorical

Distinct: 9
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Value | Count
특허 | 168
상표 | 59
서비스표 | 32
디자인 | 13
실용신안 | 12
Other values (4) | 5

Length

Max length: 7
Median length: 2
Mean length: 2.3875433
Min length: 2
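
These length statistics can be reproduced from the value counts reported for Unnamed: 1. The sketch below assumes that the placeholder `<NA>` is measured as a literal four-character string; the fact that the resulting mean matches the reported one supports that assumption but does not prove it.

```python
from statistics import mean, median

# Value counts copied from the Common Values table for Unnamed: 1.
value_counts = {
    "특허": 168, "상표": 59, "서비스표": 32, "디자인": 13, "실용신안": 12,
    "업무표장": 2, "<NA>": 1, "구분": 1, "특허 (통상)": 1,
}

# Expand the counts into one string length per observation (289 in total).
lengths = [len(value) for value, count in value_counts.items() for _ in range(count)]

print(max(lengths), median(lengths), round(mean(lengths), 7), min(lengths))
# prints 7 2 2.3875433 2
```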

Unique

Unique: 3
Unique (%): 1.0%

Sample

1st row: <NA>
2nd row: 구분
3rd row: 특허
4th row: 특허 (통상)
5th row: 특허

Common Values

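Value | Count | Frequency (%)
특허 | 168 | 58.1%
상표 | 59 | 20.4%
서비스표 | 32 | 11.1%
디자인 | 13 | 4.5%
실용신안 | 12 | 4.2%
업무표장 | 2 | 0.7%
<NA> | 1 | 0.3%
구분 | 1 | 0.3%
특허 (통상) | 1 | 0.3%

English glosses: 특허 = patent, 상표 = trademark, 서비스표 = service mark, 디자인 = design, 실용신안 = utility model, 업무표장 = business emblem, 구분 = category (the column's header label).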

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
특허 | 169 | 58.3%
상표 | 59 | 20.3%
서비스표 | 32 | 11.0%
디자인 | 13 | 4.5%
실용신안 | 12 | 4.1%
업무표장 | 2 | 0.7%
na | 1 | 0.3%
구분 | 1 | 0.3%
통상 | 1 | 0.3%
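
The table directly above counts 특허 169 times rather than 168 and lists `na` and `통상` as separate entries. That is consistent with counting lowercased, punctuation-stripped words instead of whole cell values. A minimal sketch under that assumption; the tokenisation rule here is a guess that reproduces the pattern and is not taken from ydata-profiling's documentation:

```python
import re
from collections import Counter


def word_counts(values: list[str]) -> Counter:
    """Count lowercased word tokens, ignoring punctuation (assumed rule)."""
    counts: Counter = Counter()
    for value in values:
        # \w+ keeps Hangul, Latin letters and digits; spaces and brackets split tokens.
        counts.update(re.findall(r"\w+", value.lower()))
    return counts


values = ["특허", "특허 (통상)", "<NA>", "상표"]
counts = word_counts(values)
print(counts["특허"], counts["통상"], counts["na"])  # prints 2 1 1
```

Under this rule the single value 특허 (통상) contributes one extra 특허 and one 통상, which accounts for the difference between the two tables.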

Unnamed: 2
Text

Distinct: 252
Distinct (%): 87.5%
Missing: 1
Missing (%): 0.3%
Memory size: 2.4 KiB

Length

Max length: 62
Median length: 44
Mean length: 20.631944
Min length: 4

Characters and Unicode

Total characters: 5942
Distinct characters: 404
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
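
As an illustration of those character properties, the sketch below counts characters by Unicode general category with Python's standard `unicodedata` module. The report's labels correspond to the two-letter codes, for example `Lo` for Other Letter, `Zs` for Space Separator and `Nd` for Decimal Number. The helper name is illustrative, and this is not necessarily how ydata-profiling computes its tables.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (e.g. 'Lo', 'Zs', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)


# Second sample value of this column: seven Hangul syllables and one space.
counts = category_counts("레일의 체결구조")
print(counts["Lo"], counts["Zs"])  # prints 7 1
```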

Unique

Unique: 242
Unique (%): 84.0%

Sample

1st row: 발명의 명칭
2nd row: 레일의 체결구조
3rd row: 콘크리트 표면처리용 조성물 및 표면처리제
4th row: 철도 레일 콘크리트도상 궤도구조 및 그 시공방법
5th row: 교량의 신축이음장치

Words

Value | Count | Frequency (%)
 | 58 | 4.3%
 | 41 | 3.0%
서울메트로 | 37 | 2.7%
 | 37 | 2.7%
bi | 37 | 2.7%
시스템 | 36 | 2.7%
이용한 | 32 | 2.4%
방법 | 28 | 2.1%
장치 | 20 | 1.5%
전동차 | 16 | 1.2%
Other values (686) | 1007 | 74.6%

Most occurring characters

Value | Count | Frequency (%)
(space) | 1074 | 18.1%
 | 132 | 2.2%
 | 112 | 1.9%
 | 105 | 1.8%
 | 101 | 1.7%
 | 96 | 1.6%
 | 96 | 1.6%
 | 93 | 1.6%
 | 88 | 1.5%
 | 87 | 1.5%
Other values (394) | 3958 | 66.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3878 | 65.3%
Space Separator | 1078 | 18.1%
Decimal Number | 286 | 4.8%
Uppercase Letter | 226 | 3.8%
Lowercase Letter | 181 | 3.0%
Close Punctuation | 121 | 2.0%
Open Punctuation | 121 | 2.0%
Other Punctuation | 45 | 0.8%
Dash Punctuation | 4 | 0.1%
Connector Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter

Value | Count | Frequency (%)
 | 132 | 3.4%
 | 112 | 2.9%
 | 105 | 2.7%
 | 101 | 2.6%
 | 96 | 2.5%
 | 96 | 2.5%
 | 93 | 2.4%
 | 88 | 2.3%
 | 87 | 2.2%
 | 87 | 2.2%
Other values (326) | 2881 | 74.3%

Uppercase Letter

Value | Count | Frequency (%)
I | 40 | 17.7%
B | 40 | 17.7%
S | 33 | 14.6%
M | 23 | 10.2%
T | 17 | 7.5%
C | 12 | 5.3%
A | 9 | 4.0%
R | 9 | 4.0%
D | 7 | 3.1%
P | 6 | 2.7%
Other values (13) | 30 | 13.3%

Lowercase Letter

Value | Count | Frequency (%)
e | 34 | 18.8%
o | 28 | 15.5%
t | 22 | 12.2%
i | 16 | 8.8%
l | 15 | 8.3%
r | 12 | 6.6%
u | 11 | 6.1%
y | 8 | 4.4%
v | 6 | 3.3%
a | 6 | 3.3%
Other values (8) | 23 | 12.7%

Decimal Number

Value | Count | Frequency (%)
3 | 54 | 18.9%
7 | 34 | 11.9%
1 | 32 | 11.2%
2 | 31 | 10.8%
6 | 30 | 10.5%
9 | 25 | 8.7%
0 | 23 | 8.0%
4 | 21 | 7.3%
5 | 19 | 6.6%
8 | 16 | 5.6%

Other Punctuation

Value | Count | Frequency (%)
, | 28 | 62.2%
/ | 8 | 17.8%
: | 6 | 13.3%
· | 2 | 4.4%
 | 1 | 2.2%

Close Punctuation

Value | Count | Frequency (%)
) | 70 | 57.9%
] | 50 | 41.3%
 | 1 | 0.8%

Open Punctuation

Value | Count | Frequency (%)
( | 70 | 57.9%
[ | 50 | 41.3%
 | 1 | 0.8%

Space Separator

Value | Count | Frequency (%)
(space) | 1074 | 99.6%
 | 4 | 0.4%

Dash Punctuation

Value | Count | Frequency (%)
- | 4 | 100.0%

Connector Punctuation

Value | Count | Frequency (%)
_ | 1 | 100.0%

Control

Value | Count | Frequency (%)
 | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3878 | 65.3%
Common | 1657 | 27.9%
Latin | 407 | 6.8%

Most frequent character per script

Hangul

Value | Count | Frequency (%)
 | 132 | 3.4%
 | 112 | 2.9%
 | 105 | 2.7%
 | 101 | 2.6%
 | 96 | 2.5%
 | 96 | 2.5%
 | 93 | 2.4%
 | 88 | 2.3%
 | 87 | 2.2%
 | 87 | 2.2%
Other values (326) | 2881 | 74.3%

Latin

Value | Count | Frequency (%)
I | 40 | 9.8%
B | 40 | 9.8%
e | 34 | 8.4%
S | 33 | 8.1%
o | 28 | 6.9%
M | 23 | 5.7%
t | 22 | 5.4%
T | 17 | 4.2%
i | 16 | 3.9%
l | 15 | 3.7%
Other values (31) | 139 | 34.2%

Common

Value | Count | Frequency (%)
(space) | 1074 | 64.8%
) | 70 | 4.2%
( | 70 | 4.2%
3 | 54 | 3.3%
] | 50 | 3.0%
[ | 50 | 3.0%
7 | 34 | 2.1%
1 | 32 | 1.9%
2 | 31 | 1.9%
6 | 30 | 1.8%
Other values (17) | 162 | 9.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3878 | 65.3%
ASCII | 2044 | 34.4%
None | 19 | 0.3%
Punctuation | 1 | < 0.1%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
(space) | 1074 | 52.5%
) | 70 | 3.4%
( | 70 | 3.4%
3 | 54 | 2.6%
] | 50 | 2.4%
[ | 50 | 2.4%
I | 40 | 2.0%
B | 40 | 2.0%
e | 34 | 1.7%
7 | 34 | 1.7%
Other values (47) | 528 | 25.8%

Hangul

Value | Count | Frequency (%)
 | 132 | 3.4%
 | 112 | 2.9%
 | 105 | 2.7%
 | 101 | 2.6%
 | 96 | 2.5%
 | 96 | 2.5%
 | 93 | 2.4%
 | 88 | 2.3%
 | 87 | 2.2%
 | 87 | 2.2%
Other values (326) | 2881 | 74.3%

None

Value | Count | Frequency (%)
 | 4 | 21.1%
 | 3 | 15.8%
 | 3 | 15.8%
· | 2 | 10.5%
 | 2 | 10.5%
 | 1 | 5.3%
 | 1 | 5.3%
 | 1 | 5.3%
 | 1 | 5.3%
 | 1 | 5.3%

Punctuation

Value | Count | Frequency (%)
 | 1 | 100.0%

Unnamed: 3
Unsupported

REJECTED, UNSUPPORTED

Missing: 1
Missing (%): 0.3%
Memory size: 2.4 KiB

Unnamed: 4
Text

UNIQUE

Distinct: 289
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 16
Median length: 16
Mean length: 15.955017
Min length: 4

Characters and Unicode

Total characters: 4611
Distinct characters: 20
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 289
Unique (%): 100.0%

Sample

1st row: (2020.03.16.현재)
2nd row: 등록일자
3rd row: 10-0474255-00-00
4th row: 10-0566481-00-00
5th row: 10-0595429-00-00
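
Apart from the title date and the header label, the sample values share a two-digit, seven-digit, two-digit, two-digit shape separated by hyphens. A minimal sketch for flagging values that do not follow that shape; the pattern is inferred from the values shown in this report and is not an official specification of the numbering scheme:

```python
import re

# Shape inferred from the samples, e.g. 10-0474255-00-00.
REGISTRATION_PATTERN = re.compile(r"\d{2}-\d{7}-\d{2}-\d{2}")


def is_registration_number(value: str) -> bool:
    """Return True when the whole value matches the inferred shape."""
    return REGISTRATION_PATTERN.fullmatch(value) is not None


samples = ["(2020.03.16.현재)", "등록일자", "10-0474255-00-00", "40-1532969-00-00"]
flags = [is_registration_number(s) for s in samples]
print(flags)  # prints [False, False, True, True]
```

The character statistics for this column are consistent with that reading: 861 hyphens is exactly three per value for 287 values, which leaves the two non-matching rows above.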

Words

Value | Count | Frequency (%)
2020.03.16.현재 | 1 | 0.3%
10-1731414-00-00 | 1 | 0.3%
40-0538128-00-00 | 1 | 0.3%
41-0029928-00-00 | 1 | 0.3%
30-0905421-00-00 | 1 | 0.3%
30-0829524-00-00 | 1 | 0.3%
30-0819404-00-00 | 1 | 0.3%
30-0659713-00-00 | 1 | 0.3%
40-0538127-00-00 | 1 | 0.3%
30-0655271-00-00 | 1 | 0.3%
Other values (279) | 279 | 96.5%

Most occurring characters

Value | Count | Frequency (%)
0 | 1693 | 36.7%
- | 861 | 18.7%
1 | 572 | 12.4%
4 | 255 | 5.5%
2 | 232 | 5.0%
7 | 185 | 4.0%
5 | 184 | 4.0%
3 | 162 | 3.5%
9 | 161 | 3.5%
8 | 150 | 3.3%
Other values (10) | 156 | 3.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 3739 | 81.1%
Dash Punctuation | 861 | 18.7%
Other Letter | 6 | 0.1%
Other Punctuation | 3 | 0.1%
Open Punctuation | 1 | < 0.1%
Close Punctuation | 1 | < 0.1%

Most frequent character per category

Decimal Number

Value | Count | Frequency (%)
0 | 1693 | 45.3%
1 | 572 | 15.3%
4 | 255 | 6.8%
2 | 232 | 6.2%
7 | 185 | 4.9%
5 | 184 | 4.9%
3 | 162 | 4.3%
9 | 161 | 4.3%
8 | 150 | 4.0%
6 | 145 | 3.9%

Other Letter

Value | Count | Frequency (%)
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%

Dash Punctuation

Value | Count | Frequency (%)
- | 861 | 100.0%

Other Punctuation

Value | Count | Frequency (%)
. | 3 | 100.0%

Open Punctuation

Value | Count | Frequency (%)
( | 1 | 100.0%

Close Punctuation

Value | Count | Frequency (%)
) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 4605 | 99.9%
Hangul | 6 | 0.1%

Most frequent character per script

Common

Value | Count | Frequency (%)
0 | 1693 | 36.8%
- | 861 | 18.7%
1 | 572 | 12.4%
4 | 255 | 5.5%
2 | 232 | 5.0%
7 | 185 | 4.0%
5 | 184 | 4.0%
3 | 162 | 3.5%
9 | 161 | 3.5%
8 | 150 | 3.3%
Other values (4) | 150 | 3.3%

Hangul

Value | Count | Frequency (%)
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4605 | 99.9%
Hangul | 6 | 0.1%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
0 | 1693 | 36.8%
- | 861 | 18.7%
1 | 572 | 12.4%
4 | 255 | 5.5%
2 | 232 | 5.0%
7 | 185 | 4.0%
5 | 184 | 4.0%
3 | 162 | 3.5%
9 | 161 | 3.5%
8 | 150 | 3.3%
Other values (4) | 150 | 3.3%

Hangul

Value | Count | Frequency (%)
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%
 | 1 | 16.7%

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
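
A minimal sketch of what a nullity correlation can look like, assuming it is the Pearson correlation between the "is missing" indicators of two columns. That definition is an assumption made for illustration, and the function name and toy columns are hypothetical; `None` marks a missing cell.

```python
from math import sqrt


def nullity_correlation(col_a: list, col_b: list) -> float:
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Undefined when a column is never missing or always missing.
        return float("nan")
    return cov / sqrt(var_a * var_b)


# Toy columns that are both missing only in the first row, like a title row.
col_x = [None, "연번", 1, 2]
col_y = [None, "등록번호", "2005-02-22", "2006-03-24"]
r_same = nullity_correlation(col_x, col_y)

# Toy columns whose gaps never coincide.
r_opposite = nullity_correlation([None, 1], [1, None])
print(r_same, r_opposite)  # prints 1.0 -1.0
```

A value near 1 means two columns tend to be missing in the same rows, and a value near -1 means one tends to be present exactly when the other is missing.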

Sample

First rows

(index) | 서울교통공사 산업재산권 등록현황 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4
0 | NaN | <NA> | <NA> | NaN | (2020.03.16.현재)
1 | 연번 | 구분 | 발명의 명칭 | 등록번호 | 등록일자
2 | 1 | 특허 | 레일의 체결구조 | 2005-02-22 00:00:00 | 10-0474255-00-00
3 | 2 | 특허 (통상) | 콘크리트 표면처리용 조성물 및 표면처리제 | 2006-03-24 00:00:00 | 10-0566481-00-00
4 | 3 | 특허 | 철도 레일 콘크리트도상 궤도구조 및 그 시공방법 | 2006-06-23 00:00:00 | 10-0595429-00-00
5 | 4 | 특허 | 교량의 신축이음장치 | 2006-11-02 00:00:00 | 10-0644143-00-00
6 | 5 | 특허 | ATS 차상장치 오동작 방지시스템과 그를 이용한 방법 | 2007-03-20 00:00:00 | 10-0700199-00-00
7 | 6 | 특허 | 전동차 팬터그래프 집전마찰판 및 그 제조방법 | 2007-06-21 00:00:00 | 10-0733069-00-00
8 | 7 | 특허 | 지하철 역사 양방향 비상 게이트 | 2007-09-03 00:00:00 | 10-0756641-00-00
9 | 8 | 특허 | 고인장섬유를 이용한 고압 주입 방식의 콘크리트 균열 보수방법 | 2007-10-10 00:00:00 | 10-0767736-00-00

Last rows

(index) | 서울교통공사 산업재산권 등록현황 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4
279 | 278 | 상표 | 서울교통공사(6류) | 2019-10-17 00:00:00 | 40-1532969-00-00
280 | 279 | 상표 | 서울교통공사(7류) | 2019-10-17 00:00:00 | 40-1532970-00-00
281 | 280 | 상표 | 서울교통공사(9류) | 2019-10-17 00:00:00 | 40-1532971-00-00
282 | 281 | 상표 | 서울교통공사(11류) | 2019-10-17 00:00:00 | 40-1532972-00-00
283 | 282 | 상표 | 서울교통공사(12류) | 2019-10-17 00:00:00 | 40-1532973-00-00
284 | 283 | 상표 | 서울교통공사(35류) | 2019-10-17 00:00:00 | 40-1532974-00-00
285 | 284 | 상표 | 서울교통공사(36류) | 2019-10-17 00:00:00 | 40-1532975-00-00
286 | 285 | 상표 | 서울교통공사(37류) | 2019-10-17 00:00:00 | 40-1532976-00-00
287 | 286 | 상표 | 서울교통공사(39류) | 2019-10-17 00:00:00 | 40-1532977-00-00
288 | 287 | 상표 | 서울교통공사(42류) | 2019-10-17 00:00:00 | 40-1532978-00-00