Overview

Dataset statistics

Number of variables: 7
Number of observations: 199
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.0 KiB
Average record size in memory: 56.7 B

Variable types

Text: 2
Categorical: 5

Dataset

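Description: List of institutional standard terms used in the technical data system of Korea Gas Technology Corporation ((주)한국가스기술공사). It provides fields such as term name, physical name, domain, infotype, data type, and definition.
URL: https://www.data.go.kr/data/15103148/fileData.do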

Alerts

개인정보 유형 (personal information type) has constant value "해당없음" (not applicable): Constant
공개/비공개여부 (public/private flag) has constant value "공개" (public): Constant
인포타입 (infotype) is highly overall correlated with 도메인 and 1 other field: High correlation
데이터타입 (data type) is highly overall correlated with 도메인 and 1 other field: High correlation
도메인 (domain) is highly overall correlated with 인포타입 and 1 other field: High correlation
용어명 (term name) has unique values: Unique
물리명 (physical name) has unique values: Unique
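
The Constant and Unique alerts can be reproduced by counting distinct values per column. The following is a minimal sketch, not the profiler's own implementation: it uses the first five rows of the report's Sample section, and `column_alerts` is a hypothetical helper name.

```python
from collections import Counter

# Column subset and the first five rows, copied from the report's Sample section.
columns = ["용어명", "물리명", "도메인", "개인정보 유형", "공개/비공개여부"]
rows = [
    ["결재내용", "APP_DESC", "내용", "해당없음", "공개"],
    ["결재일시", "APP_DT", "일시", "해당없음", "공개"],
    ["등록일시", "REG_DT", "일시", "해당없음", "공개"],
    ["변경일시", "UPD_DT", "일시", "해당없음", "공개"],
    ["로그일시", "LOG_DT", "일시", "해당없음", "공개"],
]

def column_alerts(columns, rows):
    """Label each column 'Constant', 'Unique', or None (hypothetical helper)."""
    alerts = {}
    for i, name in enumerate(columns):
        counts = Counter(row[i] for row in rows)
        if len(counts) == 1:
            alerts[name] = "Constant"      # one value shared by every row
        elif len(counts) == len(rows):
            alerts[name] = "Unique"        # every row has a different value
        else:
            alerts[name] = None
    return alerts

alerts = column_alerts(columns, rows)
for name, label in alerts.items():
    print(name, "->", label)
```

On this five-row subset, 도메인 is neither constant nor unique, so it receives no alert from this check.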

Reproduction

Analysis started: 2023-12-12 05:24:09.475525
Analysis finished: 2023-12-12 05:24:10.111689
Duration: 0.64 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

용어명
Text

UNIQUE 

Distinct: 199
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 10
Median length: 8
Mean length: 4.8743719
Min length: 2

Characters and Unicode

Total characters: 970
Distinct characters: 174
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
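
As an illustration of the character properties mentioned above, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The category codes `Lo`, `Lu`, and `Pc` correspond to "Other Letter", "Uppercase Letter", and "Connector Punctuation". The standard library exposes no script property, so `rough_script` is a hypothetical heuristic based on character names and is only an approximation of the report's script counts.

```python
import unicodedata
from collections import Counter

# Sample values taken from the report (용어명 and 물리명 columns).
values = ["결재내용", "결재일시", "APP_DESC", "APP_DT"]

def category_counts(strings):
    """Count characters per Unicode general category (e.g. 'Lo', 'Lu', 'Pc')."""
    return Counter(unicodedata.category(ch) for s in strings for ch in s)

def rough_script(ch):
    """Approximate the script of a character (heuristic, not the Unicode Script property)."""
    if not unicodedata.category(ch).startswith("L"):
        return "COMMON"  # treat non-letters such as '_' as script-neutral
    # First word of the character name, e.g. 'HANGUL' or 'LATIN'.
    return unicodedata.name(ch, "UNKNOWN").split()[0]

cats = category_counts(values)
scripts = Counter(rough_script(ch) for s in values for ch in s)
print(cats)
print(scripts)
```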

Unique

Unique: 199
Unique (%): 100.0%

Sample

1st row: 결재내용
2nd row: 결재일시
3rd row: 등록일시
4th row: 변경일시
5th row: 로그일시

Value               Count  Frequency (%)
결재내용            1      0.5%
원본첨부파일명      1      0.5%
규격명              1      0.5%
도면번호            1      0.5%
대상설비            1      0.5%
설비명              1      0.5%
물질명              1      0.5%
프로젝트명          1      0.5%
발행기관            1      0.5%
내용                1      0.5%
Other values (189)  189    95.0%

Most occurring characters

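Value               Count  Frequency (%)
(not shown)         38     3.9%
(not shown)         36     3.7%
(not shown)         34     3.5%
(not shown)         30     3.1%
(not shown)         28     2.9%
(not shown)         28     2.9%
(not shown)         26     2.7%
퀀                  26     2.7%
(not shown)         25     2.6%
(not shown)         20     2.1%
Other values (164)  679    70.0%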

Most occurring categories

Value             Count  Frequency (%)
Other Letter      927    95.6%
Uppercase Letter  43     4.4%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
(not shown)         38     4.1%
(not shown)         36     3.9%
(not shown)         34     3.7%
(not shown)         30     3.2%
(not shown)         28     3.0%
(not shown)         28     3.0%
(not shown)         26     2.8%
퀀                  26     2.8%
(not shown)         25     2.7%
(not shown)         20     2.2%
Other values (152)  636    68.6%

Uppercase Letter

Value             Count  Frequency (%)
I                 15     34.9%
D                 12     27.9%
P                 4      9.3%
R                 2      4.7%
E                 2      4.7%
L                 2      4.7%
X                 1      2.3%
T                 1      2.3%
C                 1      2.3%
A                 1      2.3%
Other values (2)  2      4.7%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  927    95.6%
Latin   43     4.4%

Most frequent character per script

Hangul

Value               Count  Frequency (%)
(not shown)         38     4.1%
(not shown)         36     3.9%
(not shown)         34     3.7%
(not shown)         30     3.2%
(not shown)         28     3.0%
(not shown)         28     3.0%
(not shown)         26     2.8%
퀀                  26     2.8%
(not shown)         25     2.7%
(not shown)         20     2.2%
Other values (152)  636    68.6%

Latin

Value             Count  Frequency (%)
I                 15     34.9%
D                 12     27.9%
P                 4      9.3%
R                 2      4.7%
E                 2      4.7%
L                 2      4.7%
X                 1      2.3%
T                 1      2.3%
C                 1      2.3%
A                 1      2.3%
Other values (2)  2      4.7%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  927    95.6%
ASCII   43     4.4%

Most frequent character per block

Hangul

Value               Count  Frequency (%)
(not shown)         38     4.1%
(not shown)         36     3.9%
(not shown)         34     3.7%
(not shown)         30     3.2%
(not shown)         28     3.0%
(not shown)         28     3.0%
(not shown)         26     2.8%
퀀                  26     2.8%
(not shown)         25     2.7%
(not shown)         20     2.2%
Other values (152)  636    68.6%

ASCII

Value             Count  Frequency (%)
I                 15     34.9%
D                 12     27.9%
P                 4      9.3%
R                 2      4.7%
E                 2      4.7%
L                 2      4.7%
X                 1      2.3%
T                 1      2.3%
C                 1      2.3%
A                 1      2.3%
Other values (2)  2      4.7%

물리명
Text

UNIQUE 

Distinct: 199
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 22
Median length: 17
Mean length: 9.7638191
Min length: 2

Characters and Unicode

Total characters: 1943
Distinct characters: 42
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 199
Unique (%): 100.0%

Sample

1st row: APP_DESC
2nd row: APP_DT
3rd row: REG_DT
4th row: UPD_DT
5th row: LOG_DT

Value               Count  Frequency (%)
app_desc            1      0.5%
org_file_nm         1      0.5%
standardname        1      0.5%
drawingnumber       1      0.5%
equipment           1      0.5%
equipname           1      0.5%
materialname        1      0.5%
projectname         1      0.5%
publisher           1      0.5%
내용                1      0.5%
Other values (189)  189    95.0%

Most occurring characters

Value              Count  Frequency (%)
E                  211    10.9%
_                  193    9.9%
T                  127    6.5%
D                  118    6.1%
A                  117    6.0%
O                  115    5.9%
N                  112    5.8%
R                  109    5.6%
S                  104    5.4%
M                  99     5.1%
Other values (32)  638    32.8%

Most occurring categories

Value                  Count  Frequency (%)
Uppercase Letter       1734   89.2%
Connector Punctuation  193    9.9%
Other Letter           16     0.8%

Most frequent character per category

Uppercase Letter

Value              Count  Frequency (%)
E                  211    12.2%
T                  127    7.3%
D                  118    6.8%
A                  117    6.7%
O                  115    6.6%
N                  112    6.5%
R                  109    6.3%
S                  104    6.0%
M                  99     5.7%
C                  97     5.6%
Other values (16)  525    30.3%

Other Letter

Value             Count  Frequency (%)
(not shown)       2      12.5%
(not shown)       1      6.2%
(not shown)       1      6.2%
(not shown)       1      6.2%
(not shown)       1      6.2%
(not shown)       1      6.2%
(not shown)       1      6.2%
(not shown)       1      6.2%
(not shown)       1      6.2%
(not shown)       1      6.2%
Other values (5)  5      31.2%

Connector Punctuation

Value  Count  Frequency (%)
_      193    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   1734   89.2%
Common  193    9.9%
Hangul  16     0.8%

Most frequent character per script

Latin

Value              Count  Frequency (%)
E                  211    12.2%
T                  127    7.3%
D                  118    6.8%
A                  117    6.7%
O                  115    6.6%
N                  112    6.5%
R                  109    6.3%
S                  104    6.0%
M                  99     5.7%
C                  97     5.6%
Other values (16)  525    30.3%

Hangul

Value             Count  Frequency (%)
(not shown)       2      12.5%
(not shown)       1      6.2%
(not shown)       1      6.2%
(not shown)       1      6.2%
(not shown)       1      6.2%
(not shown)       1      6.2%
(not shown)       1      6.2%
(not shown)       1      6.2%
(not shown)       1      6.2%
(not shown)       1      6.2%
Other values (5)  5      31.2%

Common

Value  Count  Frequency (%)
_      193    100.0%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   1927   99.2%
Hangul  16     0.8%

Most frequent character per block

ASCII

Value              Count  Frequency (%)
E                  211    10.9%
_                  193    10.0%
T                  127    6.6%
D                  118    6.1%
A                  117    6.1%
O                  115    6.0%
N                  112    5.8%
R                  109    5.7%
S                  104    5.4%
M                  99     5.1%
Other values (17)  622    32.3%

Hangul

Value             Count  Frequency (%)
(not shown)       2      12.5%
(not shown)       1      6.2%
(not shown)       1      6.2%
(not shown)       1      6.2%
(not shown)       1      6.2%
(not shown)       1      6.2%
(not shown)       1      6.2%
(not shown)       1      6.2%
(not shown)       1      6.2%
(not shown)       1      6.2%
Other values (5)  5      31.2%

도메인
Categorical

HIGH CORRELATION 

Distinct: 30
Distinct (%): 15.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value              Count
(not shown)        59
시퀀스             26
코드               22
설명               15
여부               12
Other values (25)  65

Length

Max length: 4
Median length: 2
Mean length: 1.8291457
Min length: 1

Unique

Unique: 12
Unique (%): 6.0%
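
The figures above are consistent with reading Distinct as the number of different values and Unique as the number of values that occur exactly once (30/199 ≈ 15.1% and 12/199 ≈ 6.0%). The sketch below illustrates that reading on the 도메인 values of the first ten rows of the report's Sample section; `distinct_and_unique` is a hypothetical helper, not the profiler's code.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return distinct/unique counts and their share of all rows (hypothetical helper)."""
    counts = Counter(values)
    n = len(values)
    distinct = len(counts)                                # number of different values
    unique = sum(1 for c in counts.values() if c == 1)    # values seen exactly once
    return {
        "distinct": distinct,
        "distinct_pct": 100.0 * distinct / n,
        "unique": unique,
        "unique_pct": 100.0 * unique / n,
    }

# 도메인 values of sample rows 0 to 9 in the report.
sample = ["내용", "일시", "일시", "일시", "일시", "코드", "코드", "코드", "코드", "코드"]
stats = distinct_and_unique(sample)
print(stats)
```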

Sample

1st row: 내용
2nd row: 일시
3rd row: 일시
4th row: 일시
5th row: 일시

Common Values

Value              Count  Frequency (%)
(not shown)        59     29.6%
시퀀스             26     13.1%
코드               22     11.1%
설명               15     7.5%
여부               12     6.0%
ID                 11     5.5%
일자               9      4.5%
내용               5      2.5%
일시               4      2.0%
순서               4      2.0%
Other values (20)  32     16.1%

Length

Histogram of lengths of the category

Value              Count  Frequency (%)
(not shown)        59     29.6%
시퀀스             26     13.1%
코드               22     11.1%
설명               15     7.5%
여부               12     6.0%
id                 11     5.5%
일자               9      4.5%
내용               5      2.5%
일시               4      2.0%
순서               4      2.0%
Other values (20)  32     16.1%

인포타입
Categorical

HIGH CORRELATION 

Distinct: 49
Distinct (%): 24.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value              Count
시퀀스_IN          26
명_VC32            24
코드_VC4           20
명_VC100           20
여부_VC1           12
Other values (44)  97

Length

Max length: 9
Median length: 6
Mean length: 6.5979899
Min length: 4

Unique

Unique: 28
Unique (%): 14.1%

Sample

1st row: 내용_VC1000
2nd row: 일시_DT
3rd row: 일시_DT
4th row: 일시_DT
5th row: 일시_DT

Common Values

Value              Count  Frequency (%)
시퀀스_IN          26     13.1%
명_VC32            24     12.1%
코드_VC4           20     10.1%
명_VC100           20     10.1%
여부_VC1           12     6.0%
설명_VC1000        11     5.5%
일자_VC8           9      4.5%
ID_VC20            8      4.0%
명_VC64            6      3.0%
순서_IN            4      2.0%
Other values (39)  59     29.6%

Length

Histogram of lengths of the category

Value              Count  Frequency (%)
시퀀스_in          26     13.1%
명_vc32            24     12.1%
코드_vc4           20     10.1%
명_vc100           20     10.1%
여부_vc1           12     6.0%
설명_vc1000        11     5.5%
일자_vc8           9      4.5%
id_vc20            8      4.0%
명_vc64            6      3.0%
명_vc128           4      2.0%
Other values (39)  59     29.6%

데이터타입
Categorical

HIGH CORRELATION 

Distinct: 18
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value              Count
INTEGER            37
VARCHAR(100)       31
VARCHAR(32)        24
VARCHAR(4)         21
VARCHAR(1000)      18
Other values (13)  68

Length

Max length: 13
Median length: 11
Mean length: 10.201005
Min length: 4

Unique

Unique: 3
Unique (%): 1.5%

Sample

1st row: VARCHAR(1000)
2nd row: DATE
3rd row: DATE
4th row: DATE
5th row: DATE

Common Values

Value             Count  Frequency (%)
INTEGER           37     18.6%
VARCHAR(100)      31     15.6%
VARCHAR(32)       24     12.1%
VARCHAR(4)        21     10.6%
VARCHAR(1000)     18     9.0%
VARCHAR(20)       12     6.0%
VARCHAR(1)        12     6.0%
VARCHAR(8)        11     5.5%
VARCHAR(4000)     8      4.0%
VARCHAR(64)       6      3.0%
Other values (8)  19     9.5%

Length

Histogram of lengths of the category

Value             Count  Frequency (%)
integer           37     18.6%
varchar(100       31     15.6%
varchar(32        24     12.1%
varchar(4         21     10.6%
varchar(1000      18     9.0%
varchar(20        12     6.0%
varchar(1         12     6.0%
varchar(8         11     5.5%
varchar(4000      8      4.0%
varchar(64        6      3.0%
Other values (8)  19     9.5%

개인정보 유형
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value     Count
해당없음  199

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 해당없음
2nd row: 해당없음
3rd row: 해당없음
4th row: 해당없음
5th row: 해당없음

Common Values

Value     Count  Frequency (%)
해당없음  199    100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value     Count  Frequency (%)
해당없음  199    100.0%

공개/비공개여부
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value  Count
공개   199

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 공개
2nd row: 공개
3rd row: 공개
4th row: 공개
5th row: 공개

Common Values

Value  Count  Frequency (%)
공개   199    100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
공개   199    100.0%

Correlations

            도메인  인포타입  데이터타입
도메인      1.000   1.000     0.953
인포타입    1.000   1.000     1.000
데이터타입  0.953   1.000     1.000

            인포타입  데이터타입  도메인
인포타입    1.000     0.910       0.942
데이터타입  0.910     1.000       0.621
도메인      0.942     0.621       1.000

            도메인  인포타입  데이터타입
도메인      1.000   0.942     0.621
인포타입    0.942   1.000     0.910
데이터타입  0.621   0.910     1.000
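
The report does not name the association measure behind these matrices. One common choice for pairs of categorical variables is Cramér's V, so the sketch below assumes that measure (without bias correction) and is not guaranteed to reproduce the values above. The input lists are the 도메인 and 데이터타입 values of sample rows 0 to 9, and `cramers_v` is a hypothetical helper.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Cramér's V (no bias correction) for two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed counts per (x, y) pair
    row = Counter(xs)
    col = Counter(ys)
    chi2 = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    if k == 0:
        return float("nan")        # undefined when either variable is constant
    return math.sqrt(chi2 / (n * k))

# 도메인 and 데이터타입 for sample rows 0 to 9 of the report.
domain = ["내용"] + ["일시"] * 4 + ["코드"] * 5
dtype = ["VARCHAR(1000)"] + ["DATE"] * 4 + ["VARCHAR(4)"] * 5
v = cramers_v(domain, dtype)
print(round(v, 3))
```

In these ten rows each domain maps to exactly one data type, so the association is perfect. The measure is undefined for a constant column, which may be why the two constant variables do not appear in the matrices.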

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
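
The per-column nullity behind these plots can be sketched as a simple count of missing cells. The example below is a minimal illustration using two rows from the report's Sample section; treating both `None` and the empty string as missing is an assumption, and `nullity_by_column` is a hypothetical helper.

```python
def nullity_by_column(rows, columns):
    """Count missing cells per column; None and '' count as missing (an assumption)."""
    return {c: sum(1 for r in rows if r.get(c) in (None, "")) for c in columns}

columns = ["용어명", "물리명", "도메인", "인포타입", "데이터타입", "개인정보 유형", "공개/비공개여부"]
rows = [
    dict(zip(columns, ["결재내용", "APP_DESC", "내용", "내용_VC1000", "VARCHAR(1000)", "해당없음", "공개"])),
    dict(zip(columns, ["결재일시", "APP_DT", "일시", "일시_DT", "DATE", "해당없음", "공개"])),
]

missing = nullity_by_column(rows, columns)
total_cells = len(rows) * len(columns)
missing_pct = 100.0 * sum(missing.values()) / total_cells
print(missing)
print(f"Missing cells (%): {missing_pct:.1f}%")
```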

Sample

First rows

     용어명            물리명              도메인  인포타입     데이터타입     개인정보 유형  공개/비공개여부
0    결재내용          APP_DESC            내용    내용_VC1000  VARCHAR(1000)  해당없음       공개
1    결재일시          APP_DT              일시    일시_DT      DATE           해당없음       공개
2    등록일시          REG_DT              일시    일시_DT      DATE           해당없음       공개
3    변경일시          UPD_DT              일시    일시_DT      DATE           해당없음       공개
4    로그일시          LOG_DT              일시    일시_DT      DATE           해당없음       공개
5    자료상태공통코드  OBJ_STATE_COMM_CD   코드    코드_VC4     VARCHAR(4)     해당없음       공개
6    콤보유형공통코드  COMBO_TYPE_COMM_CD  코드    코드_VC4     VARCHAR(4)     해당없음       공개
7    그룹유형공통코드  GRP_TYPE_COMM_CD    코드    코드_VC4     VARCHAR(4)     해당없음       공개
8    자료유형공통코드  OBJ_TYPE_COMM_CD    코드    코드_VC4     VARCHAR(4)     해당없음       공개
9    역할공통코드      ROLE_COMM_CD        코드    코드_VC4     VARCHAR(4)     해당없음       공개

Last rows

     용어명              물리명          도메인  인포타입     데이터타입     개인정보 유형  공개/비공개여부
189  댓글답변시퀀스      QNA_ANSWER_SEQ  시퀀스  시퀀스_IN    INTEGER        해당없음       공개
190  버전시퀀스          VER_SEQ         시퀀스  시퀀스_IN    INTEGER        해당없음       공개
191  자료시퀀스          OBJ_SEQ         시퀀스  시퀀스_IN    INTEGER        해당없음       공개
192  자료첨부파일시퀀스  OBJ_FILE_SEQ    시퀀스  시퀀스_IN    INTEGER        해당없음       공개
193  도면첨부파일시퀀스  CLS_FILE_SEQ    시퀀스  시퀀스_IN    INTEGER        해당없음       공개
194  도면시퀀스          CLS_SEQ         시퀀스  시퀀스_IN    INTEGER        해당없음       공개
195  자료넘버            OBJ_NUMBER      번호    번호_VC1000  VARCHAR(1000)  해당없음       공개
196  첨부파일URL유형     FILE_URL_TYPE   유형    유형_VC4     VARCHAR(4)     해당없음       공개
197  구분                DIVISION        유형    유형_VC100   VARCHAR(100)   해당없음       공개
198  부서코드            DEPTCODE        코드    코드_VC6     VARCHAR(6)     해당없음       공개
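
In the sample rows, each 인포타입 value looks like the 도메인 followed by an underscore and an abbreviated data type (for example 내용_VC1000 alongside VARCHAR(1000), 일시_DT alongside DATE, 시퀀스_IN alongside INTEGER). The sketch below encodes that pattern. The mapping is inferred from the sample rows only, other suffixes may exist in the full dataset, and `infotype_to_datatype` is a hypothetical helper.

```python
import re

# Suffix-to-type mapping inferred from the sample rows only (an assumption).
FIXED_TYPES = {"DT": "DATE", "IN": "INTEGER"}

def infotype_to_datatype(infotype):
    """Split an infotype such as '코드_VC4' into (domain, data type).

    Returns None when the suffix does not follow the inferred pattern.
    """
    domain, _, suffix = infotype.rpartition("_")
    if not domain:
        return None                      # no underscore, nothing to split
    match = re.fullmatch(r"VC(\d+)", suffix)
    if match:
        return domain, f"VARCHAR({match.group(1)})"
    if suffix in FIXED_TYPES:
        return domain, FIXED_TYPES[suffix]
    return None

for value in ["내용_VC1000", "일시_DT", "시퀀스_IN", "코드_VC6"]:
    print(value, "->", infotype_to_datatype(value))
```

A pattern like this may help explain the high correlation reported between 도메인, 인포타입, and 데이터타입.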