Overview

Dataset statistics

Number of variables: 6
Number of observations: 263
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): 0.4%
Total size in memory: 12.5 KiB
Average record size in memory: 48.5 B

Variable types

Text: 4
Categorical: 1
Boolean: 1

Dataset

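Description: A list of the organization's standard words used in the Korea Gas Technology Corporation ((주)한국가스기술공사) homepage system. It provides fields such as word name, abbreviation, English name, word type, forbidden word, and definition.
URL: https://www.data.go.kr/data/15103151/fileData.do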

Alerts

금칙어여부 has constant value "" (Constant)
Dataset has 1 (0.4%) duplicate rows (Duplicates)
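
The Constant alert corresponds to a column that holds a single distinct value. A minimal sketch of such a check in plain Python follows. It assumes rows are dictionaries keyed by column name. `constant_columns` is a hypothetical helper, not part of ydata-profiling, and the three rows are taken from the Sample section with only three of the six columns shown.

```python
def constant_columns(rows):
    """Return the column names whose value is identical in every row."""
    if not rows:
        return []
    columns = rows[0].keys()
    return [c for c in columns if len({row[c] for row in rows}) == 1]


# Three rows from the Sample section (subset of columns, for illustration).
rows = [
    {"단어명": "비주얼", "단어유형": "수식어", "금칙어여부": "N"},
    {"단어명": "접수지사", "단어유형": "분류어", "금칙어여부": "N"},
    {"단어명": "상세", "단어유형": "분류어", "금칙어여부": "N"},
]
print(constant_columns(rows))
```

A constant column carries no information for analysis, which is presumably why the report flags it.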

Reproduction

Analysis started: 2023-12-12 05:03:27.135431
Analysis finished: 2023-12-12 05:03:27.610805
Duration: 0.48 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
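
A report of this kind can usually be regenerated with a few lines of Python. The sketch below is an assumption about how that might look, not the configuration that produced this report: the CSV path, the `cp949` encoding, and the report title are placeholders, and `build_report` is a hypothetical helper name. It requires the `pandas` and `ydata-profiling` packages.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily so this module loads even without the optional packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # encoding="cp949" is an assumption; adjust it to the actual file encoding.
    df = pd.read_csv(csv_path, encoding="cp949")
    profile = ProfileReport(df, title="Dataset profile")
    profile.to_file(html_path)
    return html_path
```

Calling `build_report("standard_words.csv")`, where the file name is hypothetical, would write `report.html` next to the script.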

Variables

단어명
Text

Distinct: 234
Distinct (%): 89.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 KiB

Length

Max length: 10
Median length: 2
Mean length: 2.730038
Min length: 1
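
The length statistics above can be approximated with the standard library. This is a sketch under the assumption that length is counted in Unicode code points, which is what Python's `len` returns for a string. `length_summary` is a hypothetical helper, and the input is the five sample values of this variable rather than the full column.

```python
from statistics import mean, median


def length_summary(values):
    """Return max, median, mean and min string length in code points."""
    lengths = [len(v) for v in values]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }


sample = ["비주얼", "적용대상", "접수지사", "컬럼", "상세"]
print(length_summary(sample))
```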

Characters and Unicode

Total characters: 718
Distinct characters: 242
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
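
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's `unicodedata` module. The codes `Lo`, `Lu` and `Zs` correspond to Other Letter, Uppercase Letter and Space Separator. The standard library does not expose script or block properties, so `script_guess` is only a rough heuristic based on the first word of the character name. Both function names are hypothetical.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in values."""
    return Counter(unicodedata.category(ch) for v in values for ch in v)


def script_guess(ch):
    """Rough script label: the first word of the character's Unicode name."""
    return unicodedata.name(ch, "UNKNOWN").split()[0]


values = ["비주얼", "T코드"]
print(category_counts(values))
print(Counter(script_guess(ch) for v in values for ch in v))
```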

Unique

Unique: 209
Unique (%): 79.5%
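
Distinct and Unique measure different things: Distinct counts the different values present, while Unique counts only the values that occur exactly once. A minimal sketch of the distinction, assuming that reading of the two metrics. `distinct_and_unique` is a hypothetical helper and the input list is illustrative.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


values = ["종료", "종료", "종료", "직원", "컬럼"]
print(distinct_and_unique(values))
```

Here 종료 is distinct but not unique, because it appears three times.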

Sample

1st row: 비주얼
2nd row: 적용대상
3rd row: 접수지사
4th row: 컬럼
5th row: 상세

Value | Count | Frequency (%)
내용 | 4 | 1.5%
종료 | 3 | 1.1%
구분 | 3 | 1.1%
 | 3 | 1.1%
답글 | 3 | 1.1%
설명 | 2 | 0.8%
가능 | 2 | 0.8%
번호 | 2 | 0.8%
비밀글 | 2 | 0.8%
시작 | 2 | 0.8%
Other values (223) | 238 | 90.2%

Most occurring characters

Value | Count | Frequency (%)
 | 20 | 2.8%
 | 19 | 2.6%
 | 16 | 2.2%
 | 15 | 2.1%
 | 13 | 1.8%
 | 13 | 1.8%
 | 12 | 1.7%
 | 11 | 1.5%
 | 11 | 1.5%
 | 10 | 1.4%
Other values (232) | 578 | 80.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 697 | 97.1%
Uppercase Letter | 20 | 2.8%
Space Separator | 1 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 20 | 2.9%
 | 19 | 2.7%
 | 16 | 2.3%
 | 15 | 2.2%
 | 13 | 1.9%
 | 13 | 1.9%
 | 12 | 1.7%
 | 11 | 1.6%
 | 11 | 1.6%
 | 10 | 1.4%
Other values (218) | 557 | 79.9%

Uppercase Letter
Value | Count | Frequency (%)
C | 4 | 20.0%
S | 3 | 15.0%
L | 2 | 10.0%
I | 2 | 10.0%
M | 1 | 5.0%
U | 1 | 5.0%
R | 1 | 5.0%
P | 1 | 5.0%
A | 1 | 5.0%
N | 1 | 5.0%
Other values (3) | 3 | 15.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 697 | 97.1%
Latin | 20 | 2.8%
Common | 1 | 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 20 | 2.9%
 | 19 | 2.7%
 | 16 | 2.3%
 | 15 | 2.2%
 | 13 | 1.9%
 | 13 | 1.9%
 | 12 | 1.7%
 | 11 | 1.6%
 | 11 | 1.6%
 | 10 | 1.4%
Other values (218) | 557 | 79.9%

Latin
Value | Count | Frequency (%)
C | 4 | 20.0%
S | 3 | 15.0%
L | 2 | 10.0%
I | 2 | 10.0%
M | 1 | 5.0%
U | 1 | 5.0%
R | 1 | 5.0%
P | 1 | 5.0%
A | 1 | 5.0%
N | 1 | 5.0%
Other values (3) | 3 | 15.0%

Common
Value | Count | Frequency (%)
(space) | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 697 | 97.1%
ASCII | 21 | 2.9%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 20 | 2.9%
 | 19 | 2.7%
 | 16 | 2.3%
 | 15 | 2.2%
 | 13 | 1.9%
 | 13 | 1.9%
 | 12 | 1.7%
 | 11 | 1.6%
 | 11 | 1.6%
 | 10 | 1.4%
Other values (218) | 557 | 79.9%

ASCII
Value | Count | Frequency (%)
C | 4 | 19.0%
S | 3 | 14.3%
L | 2 | 9.5%
I | 2 | 9.5%
M | 1 | 4.8%
U | 1 | 4.8%
(space) | 1 | 4.8%
R | 1 | 4.8%
P | 1 | 4.8%
A | 1 | 4.8%
Other values (4) | 4 | 19.0%

약어
Text

Distinct: 246
Distinct (%): 93.5%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 KiB

Length

Max length: 13
Median length: 11
Mean length: 5.1596958
Min length: 1

Characters and Unicode

Total characters: 1357
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 229
Unique (%): 87.1%

Sample

1st row: VISUAL
2nd row: ASSIGN
3rd row: BRANCH
4th row: COLUMN
5th row: DETAIL

Value | Count | Frequency (%)
money | 2 | 0.8%
type | 2 | 0.8%
reply | 2 | 0.8%
private | 2 | 0.8%
pidx | 2 | 0.8%
dt | 2 | 0.8%
policy | 2 | 0.8%
order | 2 | 0.8%
survey | 2 | 0.8%
path | 2 | 0.8%
Other values (236) | 243 | 92.4%

Most occurring characters

Value | Count | Frequency (%)
E | 149 | 11.0%
T | 118 | 8.7%
I | 111 | 8.2%
O | 101 | 7.4%
R | 96 | 7.1%
N | 94 | 6.9%
A | 91 | 6.7%
S | 69 | 5.1%
P | 67 | 4.9%
D | 64 | 4.7%
Other values (16) | 397 | 29.3%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 1357 | 100.0%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
E | 149 | 11.0%
T | 118 | 8.7%
I | 111 | 8.2%
O | 101 | 7.4%
R | 96 | 7.1%
N | 94 | 6.9%
A | 91 | 6.7%
S | 69 | 5.1%
P | 67 | 4.9%
D | 64 | 4.7%
Other values (16) | 397 | 29.3%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1357 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
E | 149 | 11.0%
T | 118 | 8.7%
I | 111 | 8.2%
O | 101 | 7.4%
R | 96 | 7.1%
N | 94 | 6.9%
A | 91 | 6.7%
S | 69 | 5.1%
P | 67 | 4.9%
D | 64 | 4.7%
Other values (16) | 397 | 29.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1357 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
E | 149 | 11.0%
T | 118 | 8.7%
I | 111 | 8.2%
O | 101 | 7.4%
R | 96 | 7.1%
N | 94 | 6.9%
A | 91 | 6.7%
S | 69 | 5.1%
P | 67 | 4.9%
D | 64 | 4.7%
Other values (16) | 397 | 29.3%

영문명
Text

Distinct: 247
Distinct (%): 93.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 KiB

Length

Max length: 20
Median length: 12
Mean length: 5.1901141
Min length: 1

Characters and Unicode

Total characters: 1365
Distinct characters: 27
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 231
Unique (%): 87.8%

Sample

1st row: VISUAL
2nd row: ASSIGN
3rd row: BRANCH
4th row: COLUMN
5th row: DETAIL

Value | Count | Frequency (%)
show | 2 | 0.8%
name | 2 | 0.8%
project | 2 | 0.8%
promise | 2 | 0.8%
code | 2 | 0.8%
survey | 2 | 0.8%
work | 2 | 0.8%
path | 2 | 0.8%
policy | 2 | 0.8%
text | 2 | 0.8%
Other values (236) | 245 | 92.5%

Most occurring characters

Value | Count | Frequency (%)
E | 150 | 11.0%
T | 119 | 8.7%
I | 111 | 8.1%
O | 102 | 7.5%
R | 97 | 7.1%
N | 93 | 6.8%
A | 91 | 6.7%
S | 69 | 5.1%
P | 68 | 5.0%
C | 64 | 4.7%
Other values (17) | 401 | 29.4%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 1363 | 99.9%
Space Separator | 2 | 0.1%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
E | 150 | 11.0%
T | 119 | 8.7%
I | 111 | 8.1%
O | 102 | 7.5%
R | 97 | 7.1%
N | 93 | 6.8%
A | 91 | 6.7%
S | 69 | 5.1%
P | 68 | 5.0%
C | 64 | 4.7%
Other values (16) | 399 | 29.3%

Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1363 | 99.9%
Common | 2 | 0.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
E | 150 | 11.0%
T | 119 | 8.7%
I | 111 | 8.1%
O | 102 | 7.5%
R | 97 | 7.1%
N | 93 | 6.8%
A | 91 | 6.7%
S | 69 | 5.1%
P | 68 | 5.0%
C | 64 | 4.7%
Other values (16) | 399 | 29.3%

Common
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1365 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
E | 150 | 11.0%
T | 119 | 8.7%
I | 111 | 8.1%
O | 102 | 7.5%
R | 97 | 7.1%
N | 93 | 6.8%
A | 91 | 6.7%
S | 69 | 5.1%
P | 68 | 5.0%
C | 64 | 4.7%
Other values (17) | 401 | 29.4%

단어유형
Categorical

Distinct: 2
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 KiB

Value | Count
수식어 | 168
분류어 | 95

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 수식어
2nd row: 수식어
3rd row: 분류어
4th row: 수식어
5th row: 분류어

Common Values

Value | Count | Frequency (%)
수식어 | 168 | 63.9%
분류어 | 95 | 36.1%
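
The frequencies in this table follow directly from the counts and the 263 observations. A small sketch of that arithmetic, using a column rebuilt from the two counts above rather than the original data. `value_frequencies` is a hypothetical helper.

```python
from collections import Counter


def value_frequencies(values):
    """Return (value, count, percent) tuples, most common value first."""
    counts = Counter(values)
    total = len(values)
    return [(v, n, round(100 * n / total, 1)) for v, n in counts.most_common()]


# Column reconstructed from the reported counts, for illustration only.
column = ["수식어"] * 168 + ["분류어"] * 95
for value, count, pct in value_frequencies(column):
    print(value, count, f"{pct}%")
```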

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
수식어 | 168 | 63.9%
분류어 | 95 | 36.1%

금칙어여부
Boolean

CONSTANT 

Distinct: 1
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 395.0 B

Value | Count | Frequency (%)
False | 263 | 100.0%

정의
Text

Distinct: 233
Distinct (%): 88.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 KiB

Length

Max length: 10
Median length: 2
Mean length: 2.7262357
Min length: 1

Characters and Unicode

Total characters: 717
Distinct characters: 240
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 207
Unique (%): 78.7%

Sample

1st row: 비주얼
2nd row: 적용대상
3rd row: 접수지사
4th row: 컬럼
5th row: 상세

Value | Count | Frequency (%)
내용 | 4 | 1.5%
구분 | 3 | 1.1%
답글 | 3 | 1.1%
 | 3 | 1.1%
종료 | 3 | 1.1%
비밀글 | 2 | 0.8%
장소 | 2 | 0.8%
프로젝트 | 2 | 0.8%
코드 | 2 | 0.8%
부서 | 2 | 0.8%
Other values (222) | 239 | 90.2%

Most occurring characters

Value | Count | Frequency (%)
 | 19 | 2.6%
 | 18 | 2.5%
 | 15 | 2.1%
 | 15 | 2.1%
 | 13 | 1.8%
 | 13 | 1.8%
 | 12 | 1.7%
 | 11 | 1.5%
 | 11 | 1.5%
 | 10 | 1.4%
Other values (230) | 580 | 80.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 695 | 96.9%
Uppercase Letter | 20 | 2.8%
Space Separator | 2 | 0.3%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 19 | 2.7%
 | 18 | 2.6%
 | 15 | 2.2%
 | 15 | 2.2%
 | 13 | 1.9%
 | 13 | 1.9%
 | 12 | 1.7%
 | 11 | 1.6%
 | 11 | 1.6%
 | 10 | 1.4%
Other values (216) | 558 | 80.3%

Uppercase Letter
Value | Count | Frequency (%)
C | 4 | 20.0%
S | 3 | 15.0%
L | 2 | 10.0%
I | 2 | 10.0%
A | 1 | 5.0%
Q | 1 | 5.0%
N | 1 | 5.0%
U | 1 | 5.0%
R | 1 | 5.0%
P | 1 | 5.0%
Other values (3) | 3 | 15.0%

Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 695 | 96.9%
Latin | 20 | 2.8%
Common | 2 | 0.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 19 | 2.7%
 | 18 | 2.6%
 | 15 | 2.2%
 | 15 | 2.2%
 | 13 | 1.9%
 | 13 | 1.9%
 | 12 | 1.7%
 | 11 | 1.6%
 | 11 | 1.6%
 | 10 | 1.4%
Other values (216) | 558 | 80.3%

Latin
Value | Count | Frequency (%)
C | 4 | 20.0%
S | 3 | 15.0%
L | 2 | 10.0%
I | 2 | 10.0%
A | 1 | 5.0%
Q | 1 | 5.0%
N | 1 | 5.0%
U | 1 | 5.0%
R | 1 | 5.0%
P | 1 | 5.0%
Other values (3) | 3 | 15.0%

Common
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 695 | 96.9%
ASCII | 22 | 3.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 19 | 2.7%
 | 18 | 2.6%
 | 15 | 2.2%
 | 15 | 2.2%
 | 13 | 1.9%
 | 13 | 1.9%
 | 12 | 1.7%
 | 11 | 1.6%
 | 11 | 1.6%
 | 10 | 1.4%
Other values (216) | 558 | 80.3%

ASCII
Value | Count | Frequency (%)
C | 4 | 18.2%
S | 3 | 13.6%
(space) | 2 | 9.1%
L | 2 | 9.1%
I | 2 | 9.1%
A | 1 | 4.5%
Q | 1 | 4.5%
N | 1 | 4.5%
U | 1 | 4.5%
R | 1 | 4.5%
Other values (4) | 4 | 18.2%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
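
The numbers behind a nullity chart are per-column counts of missing cells. A minimal sketch follows, assuming rows are dictionaries and that `None` or an empty string counts as missing. That convention is an assumption, and `missing_summary` is a hypothetical helper. The two rows come from the Sample section, limited to three columns.

```python
def missing_summary(rows, columns):
    """Count missing cells per column and the overall share in percent."""
    per_column = {
        c: sum(1 for r in rows if r.get(c) in (None, "")) for c in columns
    }
    total_cells = len(rows) * len(columns)
    pct = round(100 * sum(per_column.values()) / total_cells, 1) if total_cells else 0.0
    return per_column, pct


columns = ["단어명", "약어", "영문명"]
rows = [
    {"단어명": "비주얼", "약어": "VISUAL", "영문명": "VISUAL"},
    {"단어명": "적용대상", "약어": "ASSIGN", "영문명": "ASSIGN"},
]
print(missing_summary(rows, columns))
```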

Sample

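 | 단어명 | 약어 | 영문명 | 단어유형 | 금칙어여부 | 정의
0 | 비주얼 | VISUAL | VISUAL | 수식어 | N | 비주얼
1 | 적용대상 | ASSIGN | ASSIGN | 수식어 | N | 적용대상
2 | 접수지사 | BRANCH | BRANCH | 분류어 | N | 접수지사
3 | 컬럼 | COLUMN | COLUMN | 수식어 | N | 컬럼
4 | 상세 | DETAIL | DETAIL | 분류어 | N | 상세
5 | 직원 | EMP | EMP | 수식어 | N | 직원
6 | 종료 | FINISH | FINISH | 수식어 | N | 종료
7 | 종료 | TO | TO | 수식어 | N | 종료
8 | 종료 | END | END | 수식어 | N | 종료
9 | 준공일 | FINALDAY | FINALDAY | 분류어 | N | 준공일

 | 단어명 | 약어 | 영문명 | 단어유형 | 금칙어여부 | 정의
253 | 출력 | SHOW | SHOW | 수식어 | N | 출력
254 | 출력 | OUPUT | OUPUT | 수식어 | N | 출력
255 | 부모회원 | PARENT | PARENT | 수식어 | N | 부모회원
256 | 저장경로 | PATH | PATH | 수식어 | N | 저장경로
257 | 게시 | PRESS | PRESS | 수식어 | N | 게시
258 | 분야 | REALM | REALM | 분류어 | N | 분야
259 | 리턴 | RETURN | RETURN | 수식어 | N | 리턴
260 | 스크롤 | SCROLL | SCROLL | 수식어 | N | 스크롤
261 | T코드 | TCD | TCD | 수식어 | N | T코드
262 | 유형 | TYPE | TYPE | 분류어 | N | 유형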

Duplicate rows

Most frequently occurring

 | 단어명 | 약어 | 영문명 | 단어유형 | 금칙어여부 | 정의 | # duplicates
0 |  | DT | DT | 분류어 | N |  | 2
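
The overview's figure of 1 duplicate row (0.4%) is consistent with counting surplus copies: a row that appears twice contributes one duplicate. A sketch of that counting in plain Python, under that assumption. The helper names are hypothetical and the rows are numeric placeholders, not the dataset.

```python
from collections import Counter


def duplicate_rows(rows):
    """Return each row that occurs more than once, with its occurrence count."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


def duplicate_summary(rows):
    """Return (surplus copies, their share of all rows in percent)."""
    surplus = sum(n - 1 for n in duplicate_rows(rows).values())
    return surplus, round(100 * surplus / len(rows), 1)


# 263 placeholder rows in which exactly one row is repeated once.
rows = [(i,) for i in range(262)] + [(0,)]
print(duplicate_rows(rows))
print(duplicate_summary(rows))
```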