Overview

Dataset statistics

Number of variables: 4
Number of observations: 504
Missing cells: 1
Missing cells (%): < 0.1%
Duplicate rows: 7
Duplicate rows (%): 1.4%
Total size in memory: 15.9 KiB
Average record size in memory: 32.3 B
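
The percentages and the average record size can be re-derived from the counts in the overview. The sketch below is only a consistency check: the formulas are an assumption about how these figures relate to each other, not a description of the profiler's internals.

```python
# Figures copied from the overview above.
n_variables = 4
n_observations = 504
missing_cells = 1
duplicate_rows = 7
total_size_kib = 15.9

total_cells = n_variables * n_observations               # 4 * 504 = 2016
missing_pct = 100 * missing_cells / total_cells          # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations    # share of duplicate rows
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.2f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the results round to the values listed: below 0.1% missing cells, 1.4% duplicate rows, and 32.3 B per record.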

Variable types

Text: 4

Dataset

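Description: Research and development (R&D) glossary of Korea South-East Power (한국남동발전). For each Korean term used in R&D, it contains the corresponding English term, the English abbreviation, and the meaning of the term.
URL: https://www.data.go.kr/data/15092393/fileData.do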

Alerts

Dataset has 7 (1.4%) duplicate rows (Duplicates)

Reproduction

Analysis started: 2023-12-12 21:26:39.070526
Analysis finished: 2023-12-12 21:26:39.971076
Duration: 0.9 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

한글용어 (Korean term)

Distinct: 490
Distinct (%): 97.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Length

Max length: 6
Median length: 2
Mean length: 2.1309524
Min length: 1
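
Length statistics of this kind can be sketched with the standard library. The example assumes length means the number of Unicode code points per value (what Python's `len` returns) and uses the ten Korean terms from the first rows of the sample table at the end of this report, so its results differ from the full-column figures above.

```python
from statistics import mean, median

# Korean terms from the first ten rows of the sample table.
terms = ["타당", "유사", "비밀", "외부", "배경",
         "국가", "협의", "희망", "기관", "사용자"]

# len() counts code points; each precomposed Hangul syllable counts as one.
lengths = [len(term) for term in terms]

summary = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(summary)
```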

Characters and Unicode

Total characters: 1074
Distinct characters: 279
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
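
As an illustration of such a property, the general category of each character can be read with Python's `unicodedata` module. This is a minimal sketch on hypothetical sample values, not the profiler's own code; the two-letter codes stand for Other Letter (`Lo`), Lowercase Letter (`Ll`), Uppercase Letter (`Lu`) and Decimal Number (`Nd`).

```python
import unicodedata
from collections import Counter

# Hypothetical values chosen for illustration; they are not taken from the dataset.
values = ["타당", "유사", "test1", "URL"]

# Count the Unicode general category of every character in every value.
category_counts = Counter(
    unicodedata.category(ch) for value in values for ch in value
)

for category, count in category_counts.most_common():
    print(category, count)
```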

Unique

Unique: 478
Unique (%): 94.8%
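
The report gives separate Distinct and Unique counts. The sketch below assumes the usual reading: distinct is the number of different values, and unique is the number of values that occur exactly once. The input list is illustrative only.

```python
from collections import Counter

# Illustrative values: two repeated terms and two that occur once.
values = ["완성", "완성", "완성", "옵션", "옵션", "타당", "유사"]

counts = Counter(values)
n_distinct = len(counts)                               # different values
n_unique = sum(1 for c in counts.values() if c == 1)   # values seen exactly once

print(n_distinct, n_unique)
```

The figures for this variable fit that reading: 490 distinct minus 478 unique leaves 12 repeated values, which together account for the remaining 504 - 478 = 26 rows.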

Sample

1st row: 타당
2nd row: 유사
3rd row: 비밀
4th row: 외부
5th row: 배경
Value | Count | Frequency (%)
완성 | 3 | 0.6%
옵션 | 3 | 0.6%
발송 | 2 | 0.4%
직위 | 2 | 0.4%
(not shown) | 2 | 0.4%
사업부 | 2 | 0.4%
사업 | 2 | 0.4%
시험 | 2 | 0.4%
수첩 | 2 | 0.4%
담당자 | 2 | 0.4%
Other values (480) | 482 | 95.6%

Most occurring characters

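Value | Count | Frequency (%)
(not shown) | 22 | 2.0%
(not shown) | 21 | 2.0%
(not shown) | 21 | 2.0%
(not shown) | 17 | 1.6%
(not shown) | 16 | 1.5%
(not shown) | 16 | 1.5%
(not shown) | 15 | 1.4%
(not shown) | 15 | 1.4%
(not shown) | 15 | 1.4%
(not shown) | 14 | 1.3%
Other values (269) | 902 | 84.0%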

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1046 | 97.4%
Lowercase Letter | 20 | 1.9%
Decimal Number | 5 | 0.5%
Uppercase Letter | 3 | 0.3%

Most frequent character per category

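Other Letter
Value | Count | Frequency (%)
(not shown) | 22 | 2.1%
(not shown) | 21 | 2.0%
(not shown) | 21 | 2.0%
(not shown) | 17 | 1.6%
(not shown) | 16 | 1.5%
(not shown) | 16 | 1.5%
(not shown) | 15 | 1.4%
(not shown) | 15 | 1.4%
(not shown) | 15 | 1.4%
(not shown) | 14 | 1.3%
Other values (258) | 874 | 83.6%

Decimal Number
Value | Count | Frequency (%)
1 | 1 | 20.0%
3 | 1 | 20.0%
4 | 1 | 20.0%
5 | 1 | 20.0%
2 | 1 | 20.0%

Lowercase Letter
Value | Count | Frequency (%)
t | 10 | 50.0%
e | 5 | 25.0%
s | 5 | 25.0%

Uppercase Letter
Value | Count | Frequency (%)
L | 1 | 33.3%
R | 1 | 33.3%
U | 1 | 33.3%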

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1046 | 97.4%
Latin | 23 | 2.1%
Common | 5 | 0.5%

Most frequent character per script

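Hangul
Value | Count | Frequency (%)
(not shown) | 22 | 2.1%
(not shown) | 21 | 2.0%
(not shown) | 21 | 2.0%
(not shown) | 17 | 1.6%
(not shown) | 16 | 1.5%
(not shown) | 16 | 1.5%
(not shown) | 15 | 1.4%
(not shown) | 15 | 1.4%
(not shown) | 15 | 1.4%
(not shown) | 14 | 1.3%
Other values (258) | 874 | 83.6%

Latin
Value | Count | Frequency (%)
t | 10 | 43.5%
e | 5 | 21.7%
s | 5 | 21.7%
L | 1 | 4.3%
R | 1 | 4.3%
U | 1 | 4.3%

Common
Value | Count | Frequency (%)
1 | 1 | 20.0%
3 | 1 | 20.0%
4 | 1 | 20.0%
5 | 1 | 20.0%
2 | 1 | 20.0%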

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1046 | 97.4%
ASCII | 28 | 2.6%
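
A block count like the one above can be approximated by testing code point ranges. The sketch assumes that "ASCII" means U+0000..U+007F and that "Hangul" means the Hangul Syllables block (U+AC00..U+D7AF); the report's exact block definitions are not stated, and `block_of` is a hypothetical helper written for this illustration.

```python
from collections import Counter

def block_of(ch: str) -> str:
    """Classify one character into a coarse Unicode block (illustrative only)."""
    cp = ord(ch)
    if cp <= 0x7F:
        return "ASCII"      # Basic Latin, U+0000..U+007F
    if 0xAC00 <= cp <= 0xD7AF:
        return "Hangul"     # Hangul Syllables, U+AC00..U+D7AF
    return "Other"          # every block this sketch does not cover

# Hypothetical values chosen for illustration.
values = ["타당", "test1", "URL"]
block_counts = Counter(block_of(ch) for value in values for ch in value)
print(block_counts)
```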

Most frequent character per block

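Hangul
Value | Count | Frequency (%)
(not shown) | 22 | 2.1%
(not shown) | 21 | 2.0%
(not shown) | 21 | 2.0%
(not shown) | 17 | 1.6%
(not shown) | 16 | 1.5%
(not shown) | 16 | 1.5%
(not shown) | 15 | 1.4%
(not shown) | 15 | 1.4%
(not shown) | 15 | 1.4%
(not shown) | 14 | 1.3%
Other values (258) | 874 | 83.6%

ASCII
Value | Count | Frequency (%)
t | 10 | 35.7%
e | 5 | 17.9%
s | 5 | 17.9%
1 | 1 | 3.6%
L | 1 | 3.6%
R | 1 | 3.6%
U | 1 | 3.6%
3 | 1 | 3.6%
4 | 1 | 3.6%
5 | 1 | 3.6%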

영문용어 (English term)

Distinct: 422
Distinct (%): 83.7%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Length

Max length: 32
Median length: 23
Mean length: 7.7222222
Min length: 2

Characters and Unicode

Total characters: 3892
Distinct characters: 37
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 358
Unique (%): 71.0%

Sample

1st row: adequate
2nd row: resemblance
3rd row: security
4th row: outer
5th row: background
Value | Count | Frequency (%)
business | 8 | 1.4%
number | 8 | 1.4%
job | 7 | 1.2%
type | 6 | 1.0%
name | 6 | 1.0%
a | 6 | 1.0%
of | 5 | 0.9%
item | 5 | 0.9%
company | 4 | 0.7%
person | 4 | 0.7%
Other values (419) | 525 | 89.9%

Most occurring characters

Value | Count | Frequency (%)
e | 493 | 12.7%
t | 355 | 9.1%
n | 306 | 7.9%
a | 289 | 7.4%
o | 278 | 7.1%
i | 277 | 7.1%
r | 274 | 7.0%
s | 236 | 6.1%
c | 179 | 4.6%
p | 141 | 3.6%
Other values (27) | 1064 | 27.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 3771 | 96.9%
Space Separator | 111 | 2.9%
Decimal Number | 5 | 0.1%
Open Punctuation | 2 | 0.1%
Close Punctuation | 2 | 0.1%
Dash Punctuation | 1 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 493 | 13.1%
t | 355 | 9.4%
n | 306 | 8.1%
a | 289 | 7.7%
o | 278 | 7.4%
i | 277 | 7.3%
r | 274 | 7.3%
s | 236 | 6.3%
c | 179 | 4.7%
p | 141 | 3.7%
Other values (16) | 943 | 25.0%

Decimal Number
Value | Count | Frequency (%)
5 | 1 | 20.0%
3 | 1 | 20.0%
4 | 1 | 20.0%
1 | 1 | 20.0%
2 | 1 | 20.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1 | 50.0%
[ | 1 | 50.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1 | 50.0%
] | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 111 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 3771 | 96.9%
Common | 121 | 3.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 493 | 13.1%
t | 355 | 9.4%
n | 306 | 8.1%
a | 289 | 7.7%
o | 278 | 7.4%
i | 277 | 7.3%
r | 274 | 7.3%
s | 236 | 6.3%
c | 179 | 4.7%
p | 141 | 3.7%
Other values (16) | 943 | 25.0%

Common
Value | Count | Frequency (%)
(space) | 111 | 91.7%
5 | 1 | 0.8%
3 | 1 | 0.8%
4 | 1 | 0.8%
1 | 1 | 0.8%
2 | 1 | 0.8%
- | 1 | 0.8%
( | 1 | 0.8%
) | 1 | 0.8%
[ | 1 | 0.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3892 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 493 | 12.7%
t | 355 | 9.1%
n | 306 | 7.9%
a | 289 | 7.4%
o | 278 | 7.1%
i | 277 | 7.1%
r | 274 | 7.0%
s | 236 | 6.1%
c | 179 | 4.6%
p | 141 | 3.6%
Other values (27) | 1064 | 27.3%

영어약자 (English abbreviation)

Distinct: 411
Distinct (%): 81.7%
Missing: 1
Missing (%): 0.2%
Memory size: 4.1 KiB

Length

Max length: 12
Median length: 11
Mean length: 4.5626243
Min length: 2

Characters and Unicode

Total characters: 2295
Distinct characters: 32
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 340
Unique (%): 67.6%

Sample

1st row: adequate
2nd row: rese
3rd row: scr
4th row: outer
5th row: bg
Value | Count | Frequency (%)
cmp | 4 | 0.8%
req | 4 | 0.8%
nm | 4 | 0.8%
item | 4 | 0.8%
type | 4 | 0.8%
exec | 3 | 0.6%
biz | 3 | 0.6%
send | 3 | 0.6%
cost | 3 | 0.6%
status | 3 | 0.6%
Other values (401) | 468 | 93.0%

Most occurring characters

Value | Count | Frequency (%)
e | 255 | 11.1%
t | 228 | 9.9%
r | 180 | 7.8%
a | 148 | 6.4%
s | 147 | 6.4%
o | 142 | 6.2%
n | 136 | 5.9%
p | 127 | 5.5%
c | 125 | 5.4%
i | 123 | 5.4%
Other values (22) | 684 | 29.8%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 2280 | 99.3%
Space Separator | 10 | 0.4%
Decimal Number | 5 | 0.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 255 | 11.2%
t | 228 | 10.0%
r | 180 | 7.9%
a | 148 | 6.5%
s | 147 | 6.4%
o | 142 | 6.2%
n | 136 | 6.0%
p | 127 | 5.6%
c | 125 | 5.5%
i | 123 | 5.4%
Other values (16) | 669 | 29.3%

Decimal Number
Value | Count | Frequency (%)
5 | 1 | 20.0%
3 | 1 | 20.0%
4 | 1 | 20.0%
1 | 1 | 20.0%
2 | 1 | 20.0%

Space Separator
Value | Count | Frequency (%)
(space) | 10 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2280 | 99.3%
Common | 15 | 0.7%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 255 | 11.2%
t | 228 | 10.0%
r | 180 | 7.9%
a | 148 | 6.5%
s | 147 | 6.4%
o | 142 | 6.2%
n | 136 | 6.0%
p | 127 | 5.6%
c | 125 | 5.5%
i | 123 | 5.4%
Other values (16) | 669 | 29.3%

Common
Value | Count | Frequency (%)
(space) | 10 | 66.7%
5 | 1 | 6.7%
3 | 1 | 6.7%
4 | 1 | 6.7%
1 | 1 | 6.7%
2 | 1 | 6.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2295 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 255 | 11.1%
t | 228 | 9.9%
r | 180 | 7.8%
a | 148 | 6.4%
s | 147 | 6.4%
o | 142 | 6.2%
n | 136 | 5.9%
p | 127 | 5.5%
c | 125 | 5.4%
i | 123 | 5.4%
Other values (22) | 684 | 29.8%

용어의미 (Term meaning)

Distinct: 484
Distinct (%): 96.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Length

Max length: 129
Median length: 48
Mean length: 15.392857
Min length: 1

Characters and Unicode

Total characters: 7758
Distinct characters: 530
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 467
Unique (%): 92.7%

Sample

1st row: 적당한,충분한
2nd row: 서로 비슷한 것
3rd row: 남이 알아서는 안되는 일
4th row: 바깥부분, 사외
5th row: 배경
Value | Count | Frequency (%)
또는 | 34 | 1.6%
어떤 | 32 | 1.5%
따위를 | 25 | 1.2%
(not shown) | 24 | 1.2%
일정한 | 21 | 1.0%
(not shown) | 19 | 0.9%
따라 | 19 | 0.9%
일을 | 15 | 0.7%
있는 | 14 | 0.7%
사람 | 11 | 0.5%
Other values (1397) | 1862 | 89.7%

Most occurring characters

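Value | Count | Frequency (%)
(space) | 1576 | 20.3%
, | 175 | 2.3%
(not shown) | 160 | 2.1%
(not shown) | 151 | 1.9%
(not shown) | 148 | 1.9%
(not shown) | 145 | 1.9%
(not shown) | 127 | 1.6%
(not shown) | 122 | 1.6%
(not shown) | 111 | 1.4%
(not shown) | 105 | 1.4%
Other values (520) | 4938 | 63.7%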

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5630 | 72.6%
Space Separator | 1576 | 20.3%
Other Punctuation | 307 | 4.0%
Lowercase Letter | 84 | 1.1%
Open Punctuation | 57 | 0.7%
Close Punctuation | 57 | 0.7%
Final Punctuation | 15 | 0.2%
Initial Punctuation | 15 | 0.2%
Decimal Number | 8 | 0.1%
Math Symbol | 7 | 0.1%

Most frequent character per category

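Other Letter
Value | Count | Frequency (%)
(not shown) | 160 | 2.8%
(not shown) | 151 | 2.7%
(not shown) | 148 | 2.6%
(not shown) | 145 | 2.6%
(not shown) | 127 | 2.3%
(not shown) | 122 | 2.2%
(not shown) | 111 | 2.0%
(not shown) | 105 | 1.9%
(not shown) | 93 | 1.7%
(not shown) | 87 | 1.5%
Other values (482) | 4381 | 77.8%

Lowercase Letter
Value | Count | Frequency (%)
e | 19 | 22.6%
t | 17 | 20.2%
s | 8 | 9.5%
i | 7 | 8.3%
x | 7 | 8.3%
n | 6 | 7.1%
o | 4 | 4.8%
a | 4 | 4.8%
r | 3 | 3.6%
m | 3 | 3.6%
Other values (4) | 6 | 7.1%

Other Punctuation
Value | Count | Frequency (%)
, | 175 | 57.0%
. | 86 | 28.0%
; | 31 | 10.1%
· | 14 | 4.6%
(not shown) | 1 | 0.3%

Decimal Number
Value | Count | Frequency (%)
1 | 2 | 25.0%
3 | 2 | 25.0%
4 | 2 | 25.0%
5 | 1 | 12.5%
2 | 1 | 12.5%

Open Punctuation
Value | Count | Frequency (%)
( | 47 | 82.5%
[ | 7 | 12.3%
(not shown) | 2 | 3.5%
(not shown) | 1 | 1.8%

Close Punctuation
Value | Count | Frequency (%)
) | 47 | 82.5%
] | 7 | 12.3%
(not shown) | 2 | 3.5%
(not shown) | 1 | 1.8%

Uppercase Letter
Value | Count | Frequency (%)
R | 1 | 50.0%
V | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1576 | 100.0%

Final Punctuation
Value | Count | Frequency (%)
(not shown) | 15 | 100.0%

Initial Punctuation
Value | Count | Frequency (%)
(not shown) | 15 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 7 | 100.0%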

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 5607 | 72.3%
Common | 2042 | 26.3%
Latin | 86 | 1.1%
Han | 23 | 0.3%

Most frequent character per script

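Hangul
Value | Count | Frequency (%)
(not shown) | 160 | 2.9%
(not shown) | 151 | 2.7%
(not shown) | 148 | 2.6%
(not shown) | 145 | 2.6%
(not shown) | 127 | 2.3%
(not shown) | 122 | 2.2%
(not shown) | 111 | 2.0%
(not shown) | 105 | 1.9%
(not shown) | 93 | 1.7%
(not shown) | 87 | 1.6%
Other values (462) | 4358 | 77.7%

Common
Value | Count | Frequency (%)
(space) | 1576 | 77.2%
, | 175 | 8.6%
. | 86 | 4.2%
( | 47 | 2.3%
) | 47 | 2.3%
; | 31 | 1.5%
(not shown) | 15 | 0.7%
(not shown) | 15 | 0.7%
· | 14 | 0.7%
~ | 7 | 0.3%
Other values (12) | 29 | 1.4%

Han
Value | Count | Frequency (%)
(not shown) | 2 | 8.7%
(not shown) | 2 | 8.7%
(not shown) | 2 | 8.7%
(not shown) | 1 | 4.3%
(not shown) | 1 | 4.3%
(not shown) | 1 | 4.3%
(not shown) | 1 | 4.3%
(not shown) | 1 | 4.3%
(not shown) | 1 | 4.3%
(not shown) | 1 | 4.3%
Other values (10) | 10 | 43.5%

Latin
Value | Count | Frequency (%)
e | 19 | 22.1%
t | 17 | 19.8%
s | 8 | 9.3%
i | 7 | 8.1%
x | 7 | 8.1%
n | 6 | 7.0%
o | 4 | 4.7%
a | 4 | 4.7%
r | 3 | 3.5%
m | 3 | 3.5%
Other values (6) | 8 | 9.3%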

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5607 | 72.3%
ASCII | 2077 | 26.8%
Punctuation | 30 | 0.4%
CJK | 23 | 0.3%
None | 21 | 0.3%

Most frequent character per block

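ASCII
Value | Count | Frequency (%)
(space) | 1576 | 75.9%
, | 175 | 8.4%
. | 86 | 4.1%
( | 47 | 2.3%
) | 47 | 2.3%
; | 31 | 1.5%
e | 19 | 0.9%
t | 17 | 0.8%
s | 8 | 0.4%
~ | 7 | 0.3%
Other values (20) | 64 | 3.1%

Hangul
Value | Count | Frequency (%)
(not shown) | 160 | 2.9%
(not shown) | 151 | 2.7%
(not shown) | 148 | 2.6%
(not shown) | 145 | 2.6%
(not shown) | 127 | 2.3%
(not shown) | 122 | 2.2%
(not shown) | 111 | 2.0%
(not shown) | 105 | 1.9%
(not shown) | 93 | 1.7%
(not shown) | 87 | 1.6%
Other values (462) | 4358 | 77.7%

Punctuation
Value | Count | Frequency (%)
(not shown) | 15 | 50.0%
(not shown) | 15 | 50.0%

None
Value | Count | Frequency (%)
· | 14 | 66.7%
(not shown) | 2 | 9.5%
(not shown) | 2 | 9.5%
(not shown) | 1 | 4.8%
(not shown) | 1 | 4.8%
(not shown) | 1 | 4.8%

CJK
Value | Count | Frequency (%)
(not shown) | 2 | 8.7%
(not shown) | 2 | 8.7%
(not shown) | 2 | 8.7%
(not shown) | 1 | 4.3%
(not shown) | 1 | 4.3%
(not shown) | 1 | 4.3%
(not shown) | 1 | 4.3%
(not shown) | 1 | 4.3%
(not shown) | 1 | 4.3%
(not shown) | 1 | 4.3%
Other values (10) | 10 | 43.5%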

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
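
The numbers behind both views can be sketched without any plotting: a per-column count of missing cells and a row-by-column matrix of present/absent flags. The rows below are illustrative, and the third one is a hypothetical record with a missing abbreviation; the report only states that one cell in the whole dataset is missing.

```python
columns = ["한글용어", "영문용어", "영어약자", "용어의미"]

# Illustrative rows; the last one is hypothetical and has no abbreviation.
rows = [
    ("타당", "adequate", "adequate", "적당한,충분한"),
    ("배경", "background", "bg", "배경"),
    ("예시", "example", None, "hypothetical row"),
]

# Nullity by column: how many cells are missing in each column.
missing_per_column = {
    name: sum(row[i] is None for row in rows)
    for i, name in enumerate(columns)
}

# Nullity matrix: True where a value is present, False where it is missing.
nullity_matrix = [[value is not None for value in row] for row in rows]

print(missing_per_column)
for flags in nullity_matrix:
    print("".join("#" if present else "." for present in flags))
```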

Sample

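First rows

Index | 한글용어 (Korean term) | 영문용어 (English term) | 영어약자 (English abbreviation) | 용어의미 (Term meaning)
0 | 타당 | adequate | adequate | 적당한,충분한
1 | 유사 | resemblance | rese | 서로 비슷한 것
2 | 비밀 | security | scr | 남이 알아서는 안되는 일
3 | 외부 | outer | outer | 바깥부분, 사외
4 | 배경 | background | bg | 배경
5 | 국가 | country | country | 나라,국각
6 | 협의 | discuss | discuss | 여러 사람이 모여 서로 의논함
7 | 희망 | wish | wish | 앞일에 대하여 어떤 기대를 가지고 바람
8 | 기관 | organ | org | 특별한 학술 기예를 장려하는 관직. 기정, 기좌, 기사 따위를 이른다
9 | 사용자 | user | user | 사용자

Last rows

Index | 한글용어 (Korean term) | 영문용어 (English term) | 영어약자 (English abbreviation) | 용어의미 (Term meaning)
494 | 반송 | return | return | 보낸 물건이나 우편물이 되돌아 옴
495 | 수용 | reception | recet | 거두어들여 씀
496 | 내용 | content | cont | 어떤 일의 줄거리가 되는 것
497 | 부적합 | incongruent | incr | 부조화, 모순, 부적합
498 | 적합 | fitness | fit | 꼭 들어맞음
499 | 결재 | approval | apprv | 상관이 부하가 제출한 의안을 헤아려 승인함
500 | 기술 | technique | tech | 어떤 일을 정확하고 능률적으로 해내는 솜씨
501 | 기술자 | engineer | engr | 기술에 관한 전문적인 지식과 기능을 가지고 있는 사람
502 | 요약 | summary | summary | 말이나 글의 요점을 잡아서 간추림
503 | 요인 | factor | factor | 발생원인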

Duplicate rows

Most frequently occurring

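Index | 한글용어 (Korean term) | 영문용어 (English term) | 영어약자 (English abbreviation) | 용어의미 (Term meaning) | # duplicates
4 | 옵션 | option | opt | 각종 기기에서 표준 장치 이외에 구입자의 기호에 따라 별도로 선택하여 부착할 수 있는 장치나 부품 | 3
5 | 완성 | completion | cmpt | 완전히 다 이룸(완성, 완료) | 3
0 | 발송 | sending | send | 보내다 | 2
1 | 사업부 | business division | bizdiv | 사업부, 사업구분 | 2
2 | 수첩 | note | note | 어떤 내용을 기억해 두기 위하여 적음 | 2
3 | 시험 | test | test | 재능이나 실력 따위를 일정한 절차에 따라 검사하고 평가하는 일 | 2
6 | 직위 | job position | jobpost | 직무에 따라 규정되는 사회적·행정적 위치 | 2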
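
A duplicate listing like this one can be reproduced by counting identical rows. The sketch below uses a handful of rows from the tables above, repeated by hand for illustration, and treats a row as duplicated when all four fields match; that matching rule is an assumption about how the report groups rows.

```python
from collections import Counter

# Rows taken from the tables above; the repetitions are added for illustration.
rows = [
    ("발송", "sending", "send", "보내다"),
    ("완성", "completion", "cmpt", "완전히 다 이룸(완성, 완료)"),
    ("발송", "sending", "send", "보내다"),
    ("완성", "completion", "cmpt", "완전히 다 이룸(완성, 완료)"),
    ("완성", "completion", "cmpt", "완전히 다 이룸(완성, 완료)"),
    ("배경", "background", "bg", "배경"),
]

# Tuples are hashable, so whole rows can be counted directly.
row_counts = Counter(rows)
duplicates = {row: n for row, n in row_counts.items() if n > 1}

# Most frequently occurring duplicates first.
for row, n in sorted(duplicates.items(), key=lambda item: -item[1]):
    print(row[0], row[1], row[2], n)
```

With this counting, the number of duplicate rows is the number of distinct repeated rows, which matches the 7 entries listed above.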