Overview

Dataset statistics

Number of variables: 5
Number of observations: 198
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 8.1 KiB
Average record size in memory: 41.7 B

Variable types

Numeric: 1
Categorical: 1
Text: 2
DateTime: 1

Dataset

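Description: This dataset lists the patent and other industrial property information managed by Seoul Metro (서울교통공사). It consists of a serial number (연번), the type of industrial property right (구분), the title of the invention (발명의명칭), the registration number (등록번호), and the registration date (등록일자). The data will be updated whenever changes occur.
Author: Seoul Metro (서울교통공사)
URL: https://www.data.go.kr/data/15024824/fileData.do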

Alerts

연번 is highly overall correlated with 구분 (High correlation)
구분 is highly overall correlated with 연번 (High correlation)
연번 has unique values (Unique)
등록번호 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 00:11:50.396832
Analysis finished: 2023-12-12 00:11:50.856835
Duration: 0.46 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
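
The reported duration can be checked against the two timestamps. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block of the report.
started = datetime.fromisoformat("2023-12-12 00:11:50.396832")
finished = datetime.fromisoformat("2023-12-12 00:11:50.856835")

# Elapsed wall-clock time between the start and the end of the analysis.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```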

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 198
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 99.5
Minimum: 1
Maximum: 198
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 10.85
Q1: 50.25
Median: 99.5
Q3: 148.75
95-th percentile: 188.15
Maximum: 198
Range: 197
Interquartile range (IQR): 98.5
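
The quantile figures above are consistent with linear interpolation over the integers 1 to 198. The sketch below reproduces them under that assumption (that 연번 is exactly the sequence 1..198, which matches the reported minimum, maximum, distinct count, and strict monotonicity). The `percentile` helper is a hypothetical name, not a function of the profiling tool.

```python
def percentile(sorted_values, q):
    """Linearly interpolated percentile; q is a fraction in [0, 1]."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


# Assumption: the column holds the consecutive integers 1..198.
values = list(range(1, 199))

p5 = percentile(values, 0.05)
q1 = percentile(values, 0.25)
median = percentile(values, 0.50)
q3 = percentile(values, 0.75)
p95 = percentile(values, 0.95)
value_range = values[-1] - values[0]
iqr = q3 - q1

print(round(p5, 2), q1, median, q3, round(p95, 2), value_range, iqr)
```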

Descriptive statistics

Standard deviation: 57.301832
Coefficient of variation (CV): 0.57589781
Kurtosis: -1.2
Mean: 99.5
Median Absolute Deviation (MAD): 49.5
Skewness: 0
Sum: 19701
Variance: 3283.5
Monotonicity: Strictly increasing
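
Most of these descriptive statistics follow directly from the values 1 to 198. The sketch below recomputes them with the standard library, assuming the column is exactly that sequence and that the variance uses the sample (n - 1) denominator, which is what reproduces the reported 3283.5. Kurtosis and skewness are omitted because the estimator used by the report is not stated here.

```python
import statistics

# Assumption: the column holds the consecutive integers 1..198.
values = list(range(1, 199))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std_dev = statistics.stdev(values)      # square root of the sample variance
cv = std_dev / mean                     # coefficient of variation

# Median absolute deviation: median of the distances to the median.
center = statistics.median(values)
mad = statistics.median(abs(v - center) for v in values)

# Strictly increasing means every value is larger than its predecessor.
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(total, mean, variance, round(std_dev, 6), round(cv, 8), mad, strictly_increasing)
```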
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
1                   1      0.5%
126                 1      0.5%
128                 1      0.5%
129                 1      0.5%
130                 1      0.5%
131                 1      0.5%
132                 1      0.5%
133                 1      0.5%
134                 1      0.5%
135                 1      0.5%
Other values (188)  188    94.9%

Minimum 10 values

Value  Count  Frequency (%)
1      1      0.5%
2      1      0.5%
3      1      0.5%
4      1      0.5%
5      1      0.5%
6      1      0.5%
7      1      0.5%
8      1      0.5%
9      1      0.5%
10     1      0.5%

Maximum 10 values

Value  Count  Frequency (%)
198    1      0.5%
197    1      0.5%
196    1      0.5%
195    1      0.5%
194    1      0.5%
193    1      0.5%
192    1      0.5%
191    1      0.5%
190    1      0.5%
189    1      0.5%

구분
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

특허: 82
상표: 75
서비스표: 24
디자인: 14
업무표장: 2

Length

Max length: 4
Median length: 2
Mean length: 2.3434343
Min length: 2

Unique

Unique: 1
Unique (%): 0.5%

Sample

1st row: 특허
2nd row: 특허
3rd row: 특허
4th row: 특허
5th row: 특허

Common Values

Value                      Count  Frequency (%)
특허 (patent)               82     41.4%
상표 (trademark)            75     37.9%
서비스표 (service mark)      24     12.1%
디자인 (design)             14     7.1%
업무표장 (business emblem)   2      1.0%
실용신안 (utility model)     1      0.5%
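
The frequency percentages, the length statistics, and the single unique value reported for 구분 can all be derived from the six counts in the Common Values table. The sketch below does so with the standard library; it assumes string length is measured in Unicode code points, which is what Python's `len` returns.

```python
import statistics

# Counts copied from the Common Values table for 구분.
counts = {
    "특허": 82,
    "상표": 75,
    "서비스표": 24,
    "디자인": 14,
    "업무표장": 2,
    "실용신안": 1,
}

total = sum(counts.values())
shares = {value: round(100 * n / total, 1) for value, n in counts.items()}

# One length per observation, so the statistics are weighted by frequency.
lengths = [len(value) for value, n in counts.items() for _ in range(n)]
mean_length = statistics.mean(lengths)
median_length = statistics.median(lengths)

# Values that occur exactly once are what the report counts as "Unique".
singletons = [value for value, n in counts.items() if n == 1]

print(total, shares, round(mean_length, 7), median_length, singletons)
```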

Length

Histogram of lengths of the category

Common Values (Plot)


발명의명칭
Text

Distinct: 196
Distinct (%): 99.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 61
Median length: 37
Mean length: 19.560606
Min length: 5

Characters and Unicode

Total characters: 3873
Distinct characters: 309
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
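
As an illustration of those character properties, the sketch below counts Unicode general categories for one title taken from the Sample section of this report, using Python's standard `unicodedata` module. The mapping from two-letter codes to the long names shown in the report is written out by hand and covers only the nine categories listed for this variable. Script and block lookups are not available in the standard library, so they are left out.

```python
import unicodedata
from collections import Counter

# Long names for the general categories that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
}

# One title from the sample rows of the dataset.
sample = "T Luggage 또타러기지 39류"

# unicodedata.category returns the two-letter general category of a character.
categories = Counter(unicodedata.category(ch) for ch in sample)

for code, count in categories.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```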

Unique

Unique: 194
Unique (%): 98.0%

Sample

1st row: 레일의 체결구조
2nd row: 철도 레일 콘크리트도상 궤도구조 및 그 시공방법
3rd row: 지하철 역사 양방향 비상 게이트
4th row: 스크린도어용 헤드박스의 보강구조
5th row: 스크린도어용 조립식 수직 포스트

Value               Count  Frequency (%)
(not recovered)     31     3.6%
(not recovered)     30     3.5%
(not recovered)     27     3.1%
bi                  26     3.0%
시스템               23     2.7%
방법                 16     1.9%
seoul               16     1.9%
이용한               16     1.9%
장치                 15     1.7%
39류                13     1.5%
Other values (392)  645    75.2%

Most occurring characters

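Value               Count  Frequency (%)
(space)             665    17.2%
(not recovered)     137    3.5%
(not recovered)     78     2.0%
(                   65     1.7%
)                   65     1.7%
(not recovered)     65     1.7%
3                   63     1.6%
(not recovered)     57     1.5%
(not recovered)     54     1.4%
(not recovered)     54     1.4%
Other values (299)  2570   66.4%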

Most occurring categories

Value              Count  Frequency (%)
Other Letter       2219   57.3%
Space Separator    665    17.2%
Decimal Number     300    7.7%
Uppercase Letter   248    6.4%
Lowercase Letter   194    5.0%
Open Punctuation   99     2.6%
Close Punctuation  99     2.6%
Other Punctuation  45     1.2%
Dash Punctuation   4      0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(not recovered)     137    6.2%
(not recovered)     78     3.5%
(not recovered)     65     2.9%
(not recovered)     57     2.6%
(not recovered)     54     2.4%
(not recovered)     54     2.4%
(not recovered)     53     2.4%
(not recovered)     52     2.3%
(not recovered)     49     2.2%
(not recovered)     48     2.2%
Other values (244)  1572   70.8%

Uppercase Letter
Value             Count  Frequency (%)
S                 47     19.0%
M                 30     12.1%
T                 29     11.7%
I                 27     10.9%
B                 27     10.9%
R                 14     5.6%
O                 13     5.2%
E                 12     4.8%
L                 10     4.0%
C                 9      3.6%
Other values (7)  30     12.1%

Lowercase Letter
Value             Count  Frequency (%)
e                 44     22.7%
o                 33     17.0%
r                 17     8.8%
t                 17     8.8%
l                 14     7.2%
i                 14     7.2%
u                 12     6.2%
v                 8      4.1%
y                 8      4.1%
c                 7      3.6%
Other values (6)  20     10.3%

Decimal Number
Value  Count  Frequency (%)
3      63     21.0%
1      32     10.7%
2      31     10.3%
9      31     10.3%
7      30     10.0%
6      30     10.0%
5      25     8.3%
0      23     7.7%
4      19     6.3%
8      15     5.0%

Other Punctuation
Value  Count  Frequency (%)
,      27     60.0%
/      8      17.8%
:      8      17.8%
?      1      2.2%
·      1      2.2%

Open Punctuation
Value  Count  Frequency (%)
(      65     65.7%
[      34     34.3%

Close Punctuation
Value  Count  Frequency (%)
)      65     65.7%
]      34     34.3%

Space Separator
Value    Count  Frequency (%)
(space)  665    100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      4      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  2219   57.3%
Common  1212   31.3%
Latin   442    11.4%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
(not recovered)     137    6.2%
(not recovered)     78     3.5%
(not recovered)     65     2.9%
(not recovered)     57     2.6%
(not recovered)     54     2.4%
(not recovered)     54     2.4%
(not recovered)     53     2.4%
(not recovered)     52     2.3%
(not recovered)     49     2.2%
(not recovered)     48     2.2%
Other values (244)  1572   70.8%

Latin
Value              Count  Frequency (%)
S                  47     10.6%
e                  44     10.0%
o                  33     7.5%
M                  30     6.8%
T                  29     6.6%
I                  27     6.1%
B                  27     6.1%
r                  17     3.8%
t                  17     3.8%
l                  14     3.2%
Other values (23)  157    35.5%

Common
Value              Count  Frequency (%)
(space)            665    54.9%
(                  65     5.4%
)                  65     5.4%
3                  63     5.2%
]                  34     2.8%
[                  34     2.8%
1                  32     2.6%
2                  31     2.6%
9                  31     2.6%
7                  30     2.5%
Other values (12)  162    13.4%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  2219   57.3%
ASCII   1652   42.7%
None    2      0.1%

Most frequent character per block

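ASCII
Value              Count  Frequency (%)
(space)            665    40.3%
(                  65     3.9%
)                  65     3.9%
3                  63     3.8%
S                  47     2.8%
e                  44     2.7%
]                  34     2.1%
[                  34     2.1%
o                  33     2.0%
1                  32     1.9%
Other values (43)  570    34.5%

Hangul
Value               Count  Frequency (%)
(not recovered)     137    6.2%
(not recovered)     78     3.5%
(not recovered)     65     2.9%
(not recovered)     57     2.6%
(not recovered)     54     2.4%
(not recovered)     54     2.4%
(not recovered)     53     2.4%
(not recovered)     52     2.3%
(not recovered)     49     2.2%
(not recovered)     48     2.2%
Other values (244)  1572   70.8%

None
Value            Count  Frequency (%)
·                1      50.0%
(not recovered)  1      50.0%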

등록번호
Text

UNIQUE 

Distinct: 198
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 16
Median length: 16
Mean length: 15.969697
Min length: 10

Characters and Unicode

Total characters: 3162
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 198
Unique (%): 100.0%

Sample

1st row: 10-0474255-00-00
2nd row: 10-0595429-00-00
3rd row: 10-0756641-00-00
4th row: 10-0826234-00-00
5th row: 10-0839484-00-00
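
The sample values share a fixed 16-character layout of digit groups separated by hyphens. The sketch below checks that layout with a regular expression. The group names (`prefix`, `serial`, `suffix_a`, `suffix_b`) are hypothetical labels chosen for illustration, since the report does not say what each group means. The reported minimum length of 10 shows that at least one value in the full column does not follow this layout, so the pattern describes the common case only.

```python
import re

# Sample values copied from the report.
samples = [
    "10-0474255-00-00",
    "10-0595429-00-00",
    "10-0756641-00-00",
    "10-0826234-00-00",
    "10-0839484-00-00",
]

# Hypothetical group names; only the digit counts come from the samples.
pattern = re.compile(
    r"(?P<prefix>\d{2})-(?P<serial>\d{7})-(?P<suffix_a>\d{2})-(?P<suffix_b>\d{2})"
)

# fullmatch requires the whole string to fit the layout.
parsed = [pattern.fullmatch(s).groupdict() for s in samples]
lengths = {len(s) for s in samples}

# Distinct characters used across the samples: ten digits plus the hyphen.
alphabet = sorted(set("".join(samples)))

print(parsed[0], lengths, len(alphabet))
```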

Value               Count  Frequency (%)
10-0474255-00-00    1      0.5%
41-0132221-00-00    1      0.5%
41-0132223-00-00    1      0.5%
41-0136992-00-00    1      0.5%
41-0132224-00-00    1      0.5%
41-0132225-00-00    1      0.5%
41-0132226-00-00    1      0.5%
41-0132227-00-00    1      0.5%
41-0170756-00-00    1      0.5%
41-0209370-00-00    1      0.5%
Other values (188)  188    94.9%

Most occurring characters

Value  Count  Frequency (%)
0      1133   35.8%
-      592    18.7%
1      366    11.6%
4      217    6.9%
2      170    5.4%
5      129    4.1%
7      119    3.8%
6      116    3.7%
3      111    3.5%
8      106    3.4%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    2570   81.3%
Dash Punctuation  592    18.7%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      1133   44.1%
1      366    14.2%
4      217    8.4%
2      170    6.6%
5      129    5.0%
7      119    4.6%
6      116    4.5%
3      111    4.3%
8      106    4.1%
9      103    4.0%

Dash Punctuation
Value  Count  Frequency (%)
-      592    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  3162   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      1133   35.8%
-      592    18.7%
1      366    11.6%
4      217    6.9%
2      170    5.4%
5      129    4.1%
7      119    3.8%
6      116    3.7%
3      111    3.5%
8      106    3.4%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3162   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      1133   35.8%
-      592    18.7%
1      366    11.6%
4      217    6.9%
2      170    5.4%
5      129    4.1%
7      119    3.8%
6      116    3.7%
3      111    3.5%
8      106    3.4%

등록일자
DateTime

Distinct: 116
Distinct (%): 58.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB
Minimum: 1995-12-29 00:00:00
Maximum: 2023-06-02 00:00:00

Histogram with fixed size bins (bins=50)

Interactions


Correlations

      연번    구분
연번  1.000  0.846
구분  0.846  1.000

      연번    구분
연번  1.000  0.651
구분  0.651  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index) | 연번 | 구분 | 발명의명칭 | 등록번호 | 등록일자
0 | 1 | 특허 | 레일의 체결구조 | 10-0474255-00-00 | 2005-02-22
1 | 2 | 특허 | 철도 레일 콘크리트도상 궤도구조 및 그 시공방법 | 10-0595429-00-00 | 2006-06-23
2 | 3 | 특허 | 지하철 역사 양방향 비상 게이트 | 10-0756641-00-00 | 2007-09-03
3 | 4 | 특허 | 스크린도어용 헤드박스의 보강구조 | 10-0826234-00-00 | 2008-04-23
4 | 5 | 특허 | 스크린도어용 조립식 수직 포스트 | 10-0839484-00-00 | 2008-06-12
5 | 6 | 특허 | PSD조립체 모듈 | 10-0912491-00-00 | 2009-08-10
6 | 7 | 특허 | 슬림형 자동집 개표기 | 10-0956175-00-00 | 2010-04-27
7 | 8 | 특허 | 계통설비별 색상정보를 이용한 변전소 모니터링 시스템 | 10-1008956-00-00 | 2011-01-11
8 | 9 | 특허 | 실시간 장애정보 수집 및 교통카드시스템 운영상황 디스플레이방식의 교통카드 원격정비시스템 | 10-1057126-00-00 | 2011-08-09
9 | 10 | 특허 | 전동차 자동 기동장치 및 이를 이용한 전동차 | 10-1060977-00-00 | 2011-08-25

(index) | 연번 | 구분 | 발명의명칭 | 등록번호 | 등록일자
188 | 189 | 상표 | 또타러기지 39류 | 40-1643927-00-00 | 2020-09-16
189 | 190 | 상표 | T Luggage 또타러기지 39류 | 40-1643928-00-00 | 2020-09-16
190 | 191 | 상표 | 또타딜리버리 35류 | 40-1686718-00-00 | 2021-01-26
191 | 192 | 상표 | 또타딜리버리 39류 | 40-1643929-00-00 | 2020-09-16
192 | 193 | 상표 | T Delivery 또타딜리버리 35류 | 40-1686719-00-00 | 2021-01-26
193 | 194 | 상표 | T Delivery 또타딜리버리 39류 | 40-1643930-00-00 | 2020-09-16
194 | 195 | 상표 | 또타픽업 35류 | 40-1686720-00-00 | 2021-01-26
195 | 196 | 상표 | 또타픽업 39류 | 40-1643931-00-00 | 2020-09-16
196 | 197 | 상표 | T Pick Up 또타픽업 35류 | 40-1686721-00-00 | 2021-01-26
197 | 198 | 상표 | T Pick Up 또타픽업 39류 | 40-1643932-00-00 | 2020-09-16