Overview

Dataset statistics

Number of variables: 6
Number of observations: 151
Missing cells: 2
Missing cells (%): 0.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.2 KiB
Average record size in memory: 48.9 B
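
The percentage of missing cells follows from the raw counts: the number of missing cells divided by variables times observations. A minimal sketch of that arithmetic, using the figures reported above (the variable names are illustrative and not part of the report):

```python
# Counts taken from the dataset statistics above.
n_variables = 6
n_observations = 151
missing_cells = 2

total_cells = n_variables * n_observations        # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of cells that are missing

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

With 906 cells in total, two missing cells round to the reported 0.2%.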

Variable types

Text: 4
Categorical: 1
Boolean: 1

Dataset

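Description: A list of the organization's standard words used in the technical data system of Korea Gas Technology Corporation (한국가스기술공사). It provides fields such as word name (단어명), abbreviation (약어), English name (영문명), word type (단어유형), forbidden word (금칙어), and definition (정의).
URL: https://www.data.go.kr/data/15103147/fileData.do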

Alerts

금칙어여부 has constant value "" (Constant)
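
The alert flags a column in which every row carries the same value. A minimal sketch of such a check, assuming rows are plain dictionaries; the helper name `constant_columns` is hypothetical and this is not ydata-profiling's own implementation:

```python
def constant_columns(rows):
    """Return the names of columns whose value is identical in every row."""
    if not rows:
        return []
    return [
        name for name in rows[0]
        if len({row[name] for row in rows}) == 1
    ]

# Three illustrative rows modelled on the Sample section ("N" is the raw flag value).
rows = [
    {"단어명": "결재", "단어유형": "수식어", "금칙어여부": "N"},
    {"단어명": "내용", "단어유형": "분류어", "금칙어여부": "N"},
    {"단어명": "설명", "단어유형": "분류어", "금칙어여부": "N"},
]
print(constant_columns(rows))
```

Only 금칙어여부 is returned here, because 단어명 and 단어유형 each take more than one value.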

Reproduction

Analysis started: 2023-12-12 09:40:15.523545
Analysis finished: 2023-12-12 09:40:16.193540
Duration: 0.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
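
The reported duration is the difference between the two timestamps. A small sketch with the standard library, using the values listed above:

```python
from datetime import datetime

# Timestamps as printed in the Reproduction section.
started = datetime.fromisoformat("2023-12-12 09:40:15.523545")
finished = datetime.fromisoformat("2023-12-12 09:40:16.193540")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```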

Variables

단어명
Text

Distinct: 149
Distinct (%): 98.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 12
Median length: 11
Mean length: 4.9139073
Min length: 1

Characters and Unicode

Total characters: 742
Distinct characters: 178
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
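
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's `unicodedata` module. The function name `category_counts` is hypothetical, and the three strings are values that appear elsewhere in this report. The module returns two-letter codes: `Lo` is Other Letter, `Lu` Uppercase Letter, `Pd` Dash Punctuation, `Zs` Space Separator, and `Po` Other Punctuation. It does not expose scripts or blocks, so those breakdowns are not reproduced here.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over all characters in the given strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# Values taken from the tables in this report.
values = ["첨부파일", "정보-출원지사", "URL"]
counts = category_counts(values)
print(counts)
```

Hangul syllables fall under `Lo`, which is why Other Letter dominates the category tables for the Korean text columns.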

Unique

Unique: 147
Unique (%): 97.4%
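
Distinct and Unique measure different things: Distinct counts the different values present, while Unique counts the values that occur exactly once. A minimal sketch of the distinction, with a hypothetical helper `distinct_and_unique` applied to the first five 약어 values listed in the Sample section:

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique): different values, and values occurring exactly once."""
    freq = Counter(values)
    distinct = len(freq)
    unique = sum(1 for n in freq.values() if n == 1)
    return distinct, unique

# First five 약어 values from the Sample section.
values = ["APP", "DESC", "DESC", "FILE", "FILE"]
print(distinct_and_unique(values))   # (3, 1)
```

For this variable, 149 distinct and 147 unique values over 151 rows mean that two values each occur twice (147 + 2 × 2 = 151).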

Sample

1st row: 결재
2nd row: 내용
3rd row: 설명
4th row: 첨부파일
5th row: 파일

Value | Count | Frequency (%)
자료 | 54 | 26.5%
버전 | 2 | 1.0%
구분 | 2 | 1.0%
소유자 | 1 | 0.5%
url | 1 | 0.5%
정보-출원지사 | 1 | 0.5%
데이터 | 1 | 0.5%
세션 | 1 | 0.5%
결재 | 1 | 0.5%
시퀀스 | 1 | 0.5%
Other values (139) | 139 | 68.1%

Most occurring characters

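Value | Count | Frequency (%)
(glyph lost) | 64 | 8.6%
(glyph lost) | 56 | 7.5%
(glyph lost) | 56 | 7.5%
(glyph lost) | 56 | 7.5%
- | 54 | 7.3%
(space) | 53 | 7.1%
(glyph lost) | 14 | 1.9%
(glyph lost) | 11 | 1.5%
(glyph lost) | 11 | 1.5%
(glyph lost) | 10 | 1.3%
Other values (168) | 357 | 48.1%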

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 607 | 81.8%
Dash Punctuation | 54 | 7.3%
Space Separator | 53 | 7.1%
Uppercase Letter | 27 | 3.6%
Other Punctuation | 1 | 0.1%

Most frequent character per category

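Other Letter
Value | Count | Frequency (%)
(glyph lost) | 64 | 10.5%
(glyph lost) | 56 | 9.2%
(glyph lost) | 56 | 9.2%
(glyph lost) | 56 | 9.2%
(glyph lost) | 14 | 2.3%
(glyph lost) | 11 | 1.8%
(glyph lost) | 11 | 1.8%
(glyph lost) | 10 | 1.6%
(glyph lost) | 8 | 1.3%
(glyph lost) | 8 | 1.3%
Other values (150) | 313 | 51.6%

Uppercase Letter
Value | Count | Frequency (%)
I | 4 | 14.8%
R | 3 | 11.1%
E | 3 | 11.1%
S | 2 | 7.4%
O | 2 | 7.4%
N | 2 | 7.4%
L | 2 | 7.4%
D | 2 | 7.4%
V | 1 | 3.7%
U | 1 | 3.7%
Other values (5) | 5 | 18.5%

Dash Punctuation
Value | Count | Frequency (%)
- | 54 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 53 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 1 | 100.0%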

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 607 | 81.8%
Common | 108 | 14.6%
Latin | 27 | 3.6%

Most frequent character per script

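Hangul
Value | Count | Frequency (%)
(glyph lost) | 64 | 10.5%
(glyph lost) | 56 | 9.2%
(glyph lost) | 56 | 9.2%
(glyph lost) | 56 | 9.2%
(glyph lost) | 14 | 2.3%
(glyph lost) | 11 | 1.8%
(glyph lost) | 11 | 1.8%
(glyph lost) | 10 | 1.6%
(glyph lost) | 8 | 1.3%
(glyph lost) | 8 | 1.3%
Other values (150) | 313 | 51.6%

Latin
Value | Count | Frequency (%)
I | 4 | 14.8%
R | 3 | 11.1%
E | 3 | 11.1%
S | 2 | 7.4%
O | 2 | 7.4%
N | 2 | 7.4%
L | 2 | 7.4%
D | 2 | 7.4%
V | 1 | 3.7%
U | 1 | 3.7%
Other values (5) | 5 | 18.5%

Common
Value | Count | Frequency (%)
- | 54 | 50.0%
(space) | 53 | 49.1%
. | 1 | 0.9%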

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 607 | 81.8%
ASCII | 135 | 18.2%

Most frequent character per block

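Hangul
Value | Count | Frequency (%)
(glyph lost) | 64 | 10.5%
(glyph lost) | 56 | 9.2%
(glyph lost) | 56 | 9.2%
(glyph lost) | 56 | 9.2%
(glyph lost) | 14 | 2.3%
(glyph lost) | 11 | 1.8%
(glyph lost) | 11 | 1.8%
(glyph lost) | 10 | 1.6%
(glyph lost) | 8 | 1.3%
(glyph lost) | 8 | 1.3%
Other values (150) | 313 | 51.6%

ASCII
Value | Count | Frequency (%)
- | 54 | 40.0%
(space) | 53 | 39.3%
I | 4 | 3.0%
R | 3 | 2.2%
E | 3 | 2.2%
S | 2 | 1.5%
O | 2 | 1.5%
N | 2 | 1.5%
L | 2 | 1.5%
D | 2 | 1.5%
Other values (8) | 8 | 5.9%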

약어
Text

Distinct: 141
Distinct (%): 94.0%
Missing: 1
Missing (%): 0.7%
Memory size: 1.3 KiB

Length

Max length: 19
Median length: 15
Mean length: 6.44
Min length: 2

Characters and Unicode

Total characters: 966
Distinct characters: 28
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 132
Unique (%): 88.0%

Sample

1st row: APP
2nd row: DESC
3rd row: DESC
4th row: FILE
5th row: FILE

Value | Count | Frequency (%)
upd | 2 | 1.3%
desc | 2 | 1.3%
version | 2 | 1.3%
reg | 2 | 1.3%
registdate | 2 | 1.3%
caution | 2 | 1.3%
revision | 2 | 1.3%
file | 2 | 1.3%
org | 2 | 1.3%
link | 1 | 0.7%
Other values (131) | 131 | 87.3%

Most occurring characters

Value | Count | Frequency (%)
E | 124 | 12.8%
T | 83 | 8.6%
A | 79 | 8.2%
N | 74 | 7.7%
R | 71 | 7.3%
I | 68 | 7.0%
O | 64 | 6.6%
D | 59 | 6.1%
S | 47 | 4.9%
U | 41 | 4.2%
Other values (18) | 256 | 26.5%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 964 | 99.8%
Other Letter | 2 | 0.2%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
E | 124 | 12.9%
T | 83 | 8.6%
A | 79 | 8.2%
N | 74 | 7.7%
R | 71 | 7.4%
I | 68 | 7.1%
O | 64 | 6.6%
D | 59 | 6.1%
S | 47 | 4.9%
U | 41 | 4.3%
Other values (16) | 254 | 26.3%

Other Letter
Value | Count | Frequency (%)
(glyph lost) | 1 | 50.0%
(glyph lost) | 1 | 50.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 964 | 99.8%
Hangul | 2 | 0.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
E | 124 | 12.9%
T | 83 | 8.6%
A | 79 | 8.2%
N | 74 | 7.7%
R | 71 | 7.4%
I | 68 | 7.1%
O | 64 | 6.6%
D | 59 | 6.1%
S | 47 | 4.9%
U | 41 | 4.3%
Other values (16) | 254 | 26.3%

Hangul
Value | Count | Frequency (%)
(glyph lost) | 1 | 50.0%
(glyph lost) | 1 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 964 | 99.8%
Hangul | 2 | 0.2%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
E | 124 | 12.9%
T | 83 | 8.6%
A | 79 | 8.2%
N | 74 | 7.7%
R | 71 | 7.4%
I | 68 | 7.1%
O | 64 | 6.6%
D | 59 | 6.1%
S | 47 | 4.9%
U | 41 | 4.3%
Other values (16) | 254 | 26.3%

Hangul
Value | Count | Frequency (%)
(glyph lost) | 1 | 50.0%
(glyph lost) | 1 | 50.0%

영문명
Text

Distinct: 141
Distinct (%): 94.0%
Missing: 1
Missing (%): 0.7%
Memory size: 1.3 KiB

Length

Max length: 19
Median length: 15
Mean length: 7.1333333
Min length: 2

Characters and Unicode

Total characters: 1070
Distinct characters: 42
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2

Unique

Unique: 133
Unique (%): 88.7%

Sample

1st row: APP
2nd row: DESC
3rd row: DESC
4th row: FILE
5th row: FILE

Value | Count | Frequency (%)
version | 3 | 1.9%
date | 3 | 1.9%
desc | 2 | 1.3%
reg | 2 | 1.3%
level | 2 | 1.3%
caution | 2 | 1.3%
revision | 2 | 1.3%
file | 2 | 1.3%
answer | 2 | 1.3%
number | 2 | 1.3%
Other values (134) | 134 | 85.9%

Most occurring characters

Value | Count | Frequency (%)
E | 139 | 13.0%
T | 92 | 8.6%
A | 89 | 8.3%
N | 81 | 7.6%
R | 76 | 7.1%
I | 74 | 6.9%
O | 73 | 6.8%
D | 59 | 5.5%
S | 54 | 5.0%
U | 44 | 4.1%
Other values (32) | 289 | 27.0%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 1045 | 97.7%
Lowercase Letter | 17 | 1.6%
Space Separator | 6 | 0.6%
Other Letter | 2 | 0.2%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
E | 139 | 13.3%
T | 92 | 8.8%
A | 89 | 8.5%
N | 81 | 7.8%
R | 76 | 7.3%
I | 74 | 7.1%
O | 73 | 7.0%
D | 59 | 5.6%
S | 54 | 5.2%
U | 44 | 4.2%
Other values (16) | 264 | 25.3%

Lowercase Letter
Value | Count | Frequency (%)
n | 3 | 17.6%
s | 2 | 11.8%
i | 2 | 11.8%
a | 1 | 5.9%
m | 1 | 5.9%
l | 1 | 5.9%
u | 1 | 5.9%
t | 1 | 5.9%
o | 1 | 5.9%
d | 1 | 5.9%
Other values (3) | 3 | 17.6%

Other Letter
Value | Count | Frequency (%)
(glyph lost) | 1 | 50.0%
(glyph lost) | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 6 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1062 | 99.3%
Common | 6 | 0.6%
Hangul | 2 | 0.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
E | 139 | 13.1%
T | 92 | 8.7%
A | 89 | 8.4%
N | 81 | 7.6%
R | 76 | 7.2%
I | 74 | 7.0%
O | 73 | 6.9%
D | 59 | 5.6%
S | 54 | 5.1%
U | 44 | 4.1%
Other values (29) | 281 | 26.5%

Hangul
Value | Count | Frequency (%)
(glyph lost) | 1 | 50.0%
(glyph lost) | 1 | 50.0%

Common
Value | Count | Frequency (%)
(space) | 6 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1068 | 99.8%
Hangul | 2 | 0.2%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
E | 139 | 13.0%
T | 92 | 8.6%
A | 89 | 8.3%
N | 81 | 7.6%
R | 76 | 7.1%
I | 74 | 6.9%
O | 73 | 6.8%
D | 59 | 5.5%
S | 54 | 5.1%
U | 44 | 4.1%
Other values (30) | 287 | 26.9%

Hangul
Value | Count | Frequency (%)
(glyph lost) | 1 | 50.0%
(glyph lost) | 1 | 50.0%

단어유형
Categorical

Distinct: 2
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB
수식어: 123
분류어: 28

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 수식어
2nd row: 분류어
3rd row: 분류어
4th row: 분류어
5th row: 분류어

Common Values

Value | Count | Frequency (%)
수식어 | 123 | 81.5%
분류어 | 28 | 18.5%
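
The frequency column is each count divided by the number of non-missing observations. A minimal sketch that rebuilds the column from the two reported counts; the list is reconstructed from those counts for illustration and is not the original data:

```python
from collections import Counter

# Counts reported for 단어유형; the list is rebuilt from them for illustration.
values = ["수식어"] * 123 + ["분류어"] * 28

freq = Counter(values)
total = sum(freq.values())
for value, count in freq.most_common():
    print(f"{value}\t{count}\t{100 * count / total:.1f}%")
```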

Length

Histogram of lengths of the category


금칙어여부
Boolean

CONSTANT 

Distinct: 1
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 283.0 B

Value | Count | Frequency (%)
False | 151 | 100.0%

정의
Text

Distinct: 148
Distinct (%): 98.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 44
Median length: 26
Mean length: 6.3774834
Min length: 1

Characters and Unicode

Total characters: 963
Distinct characters: 221
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2

Unique

Unique: 145
Unique (%): 96.0%

Sample

1st row: 결재
2nd row: 내용
3rd row: 설명
4th row: 첨부파일
5th row: 파일

Value | Count | Frequency (%)
자료 | 55 | 21.6%
삭제 | 2 | 0.8%
작업을 | 2 | 0.8%
사람 | 2 | 0.8%
하는 | 2 | 0.8%
주소 | 2 | 0.8%
순서 | 2 | 0.8%
버전 | 2 | 0.8%
등급 | 2 | 0.8%
정보-참가분야 | 1 | 0.4%
Other values (183) | 183 | 71.8%

Most occurring characters

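Value | Count | Frequency (%)
(space) | 106 | 11.0%
(glyph lost) | 67 | 7.0%
(glyph lost) | 59 | 6.1%
(glyph lost) | 58 | 6.0%
- | 57 | 5.9%
(glyph lost) | 56 | 5.8%
(glyph lost) | 16 | 1.7%
(glyph lost) | 12 | 1.2%
(glyph lost) | 11 | 1.1%
(glyph lost) | 11 | 1.1%
Other values (211) | 510 | 53.0%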

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 769 | 79.9%
Space Separator | 106 | 11.0%
Dash Punctuation | 57 | 5.9%
Uppercase Letter | 28 | 2.9%
Other Punctuation | 3 | 0.3%

Most frequent character per category

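Other Letter
Value | Count | Frequency (%)
(glyph lost) | 67 | 8.7%
(glyph lost) | 59 | 7.7%
(glyph lost) | 58 | 7.5%
(glyph lost) | 56 | 7.3%
(glyph lost) | 16 | 2.1%
(glyph lost) | 12 | 1.6%
(glyph lost) | 11 | 1.4%
(glyph lost) | 11 | 1.4%
(glyph lost) | 11 | 1.4%
(glyph lost) | 9 | 1.2%
Other values (192) | 459 | 59.7%

Uppercase Letter
Value | Count | Frequency (%)
I | 4 | 14.3%
Y | 4 | 14.3%
D | 3 | 10.7%
M | 2 | 7.1%
R | 2 | 7.1%
S | 2 | 7.1%
N | 2 | 7.1%
O | 2 | 7.1%
A | 1 | 3.6%
C | 1 | 3.6%
Other values (5) | 5 | 17.9%

Other Punctuation
Value | Count | Frequency (%)
, | 2 | 66.7%
. | 1 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 106 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 57 | 100.0%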

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 769 | 79.9%
Common | 166 | 17.2%
Latin | 28 | 2.9%

Most frequent character per script

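Hangul
Value | Count | Frequency (%)
(glyph lost) | 67 | 8.7%
(glyph lost) | 59 | 7.7%
(glyph lost) | 58 | 7.5%
(glyph lost) | 56 | 7.3%
(glyph lost) | 16 | 2.1%
(glyph lost) | 12 | 1.6%
(glyph lost) | 11 | 1.4%
(glyph lost) | 11 | 1.4%
(glyph lost) | 11 | 1.4%
(glyph lost) | 9 | 1.2%
Other values (192) | 459 | 59.7%

Latin
Value | Count | Frequency (%)
I | 4 | 14.3%
Y | 4 | 14.3%
D | 3 | 10.7%
M | 2 | 7.1%
R | 2 | 7.1%
S | 2 | 7.1%
N | 2 | 7.1%
O | 2 | 7.1%
A | 1 | 3.6%
C | 1 | 3.6%
Other values (5) | 5 | 17.9%

Common
Value | Count | Frequency (%)
(space) | 106 | 63.9%
- | 57 | 34.3%
, | 2 | 1.2%
. | 1 | 0.6%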

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 769 | 79.9%
ASCII | 194 | 20.1%

Most frequent character per block

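ASCII
Value | Count | Frequency (%)
(space) | 106 | 54.6%
- | 57 | 29.4%
I | 4 | 2.1%
Y | 4 | 2.1%
D | 3 | 1.5%
M | 2 | 1.0%
, | 2 | 1.0%
R | 2 | 1.0%
S | 2 | 1.0%
N | 2 | 1.0%
Other values (9) | 10 | 5.2%

Hangul
Value | Count | Frequency (%)
(glyph lost) | 67 | 8.7%
(glyph lost) | 59 | 7.7%
(glyph lost) | 58 | 7.5%
(glyph lost) | 56 | 7.3%
(glyph lost) | 16 | 2.1%
(glyph lost) | 12 | 1.6%
(glyph lost) | 11 | 1.4%
(glyph lost) | 11 | 1.4%
(glyph lost) | 11 | 1.4%
(glyph lost) | 9 | 1.2%
Other values (192) | 459 | 59.7%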

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
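
One common way to quantify nullity correlation is the Pearson correlation between the missing-value indicators of two columns. The sketch below takes that approach as an assumption; it is not taken from the report's own code, the helper name `nullity_correlation` is hypothetical, and the two short columns are made-up examples.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns.

    Returns None when either column is never or always missing, because the
    indicator then has zero variance and the correlation is undefined.
    """
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)

# Hypothetical columns that are missing in the same row.
abbr = ["APP", None, "FILE", "TITLE"]
eng = ["APP", None, "FILE", "TITLE"]
print(nullity_correlation(abbr, eng))
```

A value near 1 means the two columns tend to be missing together, a value near -1 means one is present when the other is missing, and columns with no missing values have no defined nullity correlation.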

Sample

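Row | 단어명 | 약어 | 영문명 | 단어유형 | 금칙어여부 | 정의
0 | 결재 | APP | APP | 수식어 | N | 결재
1 | 내용 | DESC | DESC | 분류어 | N | 내용
2 | 설명 | DESC | DESC | 분류어 | N | 설명
3 | 첨부파일 | FILE | FILE | 분류어 | N | 첨부파일
4 | 파일 | FILE | FILE | 분류어 | N | 파일
5 | 구분 | GUBUN | GUBUN | 수식어 | N | 구분
6 | TRX | TRX | TRX | 수식어 | N | TRX
7 | 공통 | COMM | COMM | 수식어 | N | 공통
8 | 제목 | TITLE | TITLE | 분류어 | N | 제목
9 | 기본값 | DEFAULT | DEFAULT | 분류어 | N | 기본값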
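
Row | 단어명 | 약어 | 영문명 | 단어유형 | 금칙어여부 | 정의
141 | 읽기전용 | READONLY | READONLY | 수식어 | N | 읽을수만 있음
142 | 버전 | VER | VERSION | 수식어 | N | 소프트웨어 따위가 몇번 개정되었는지 나타냄
143 | 미리보기 | PREV | PREVIEW | 수식어 | N | 사전에 봄
144 | 마스타 | MASTER | MASTER | 수식어 | N | 마스타
145 | URL | URL | URL | 수식어 | N | 인터넷 주소
146 | 소유자 | OWNER | OWNER | 수식어 | N | 소유하는 사람
147 | 넘버 | NUMBER | NUMBER | 분류어 | N | 숫자값
148 | 레벨 | LVL | LEVEL | 수식어 | N | 등급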
149COUNTCOUNT분류어N
150DELDELETEDEL수식어N삭제