Overview

Dataset statistics

Number of variables: 3
Number of observations: 503
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.9 KiB
Average record size in memory: 24.3 B
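
The counts above can be approximated with a few lines of standard-library Python. This is a minimal sketch, not the profiler's own implementation: `dataset_stats` is a hypothetical helper, the rows are the first five records from the report's Sample section, and duplicates are assumed to be counted as repeats beyond the first occurrence.

```python
from collections import Counter

# First five records from the report's Sample section
# (부서코드, 부서명, 등록아이디).
rows = [
    ("D0000003", "사장", "182361"),
    ("D0000004", "감사", "182361"),
    ("D0001406", "경영전략본부", "182361"),
    ("D0000940", "정비사업본부", "182361"),
    ("D0000984", "에너지사업본부", "182361"),
]


def dataset_stats(rows):
    """Hypothetical helper: basic table-level statistics for a list of tuples."""
    n_obs = len(rows)
    n_vars = len(rows[0]) if rows else 0
    total_cells = n_obs * n_vars
    # A cell is treated as missing only when it is None (assumption).
    missing = sum(1 for row in rows for cell in row if cell is None)
    # Every repeat of a row beyond its first occurrence counts as a duplicate.
    duplicates = sum(c - 1 for c in Counter(rows).values() if c > 1)
    return {
        "n_vars": n_vars,
        "n_obs": n_obs,
        "missing_cells": missing,
        "missing_pct": missing / total_cells if total_cells else 0.0,
        "duplicate_rows": duplicates,
        "duplicate_pct": duplicates / n_obs if n_obs else 0.0,
    }


stats = dataset_stats(rows)
print(stats)
```

On the full 503-row dataset the same logic would be expected to report zero missing cells and zero duplicate rows, as listed above.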

Variable types

Text: 2
Categorical: 1

Dataset

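Description: This dataset publishes the main department information from the Korea Gas Technology Corporation (한국가스기술공사) website, together with related website fields such as menu code, registration date, modification date, and registrant.
Author: (주)한국가스기술공사 (Korea Gas Technology Corporation)
URL: https://www.data.go.kr/data/15091328/fileData.do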

Alerts

등록아이디 (registrant ID) is highly imbalanced (87.8%) [Imbalance]
부서코드 (department code) has unique values [Unique]
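
The 87.8% imbalance figure can be reproduced from the value counts reported for 등록아이디 (482, 18, 1, 1, 1). The sketch below assumes the score is one minus the Shannon entropy normalised by its maximum for the number of classes. That definition is an assumption, although it does yield the reported figure for these counts. `imbalance_score` is a hypothetical name.

```python
import math


def imbalance_score(counts):
    """Assumed definition: 1 - (Shannon entropy / maximum possible entropy)."""
    n_classes = len(counts)
    if n_classes <= 1:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Value counts for 등록아이디 as listed in the report.
counts = [482, 18, 1, 1, 1]
score = imbalance_score(counts)
print(f"{score:.1%}")  # prints 87.8%
```

A score of 0 would mean perfectly balanced classes. A score near 1 means one class dominates, as it does here with 482 of 503 rows sharing a single ID.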

Reproduction

Analysis started: 2023-12-12 16:00:55.743155
Analysis finished: 2023-12-12 16:00:56.012226
Duration: 0.27 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
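
As a quick consistency check, the reported duration follows from the two timestamps. The snippet is an illustrative sketch using only the standard library, and the variable names are arbitrary.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction section.
started = datetime.strptime("2023-12-12 16:00:55.743155", FMT)
finished = datetime.strptime("2023-12-12 16:00:56.012226", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # prints 0.27 seconds
```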

Variables

부서코드 (department code)
Text

UNIQUE 

Distinct: 503
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 4024
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
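
To make the category counts concrete, the sketch below tallies Unicode general categories for the five sample 부서코드 values using Python's `unicodedata` module. The two-letter codes `Lu` and `Nd` correspond to the report's "Uppercase Letter" and "Decimal Number". Script and block lookups are not available in the standard library and are left out. `category_counts` is a hypothetical helper.

```python
import unicodedata
from collections import Counter

# The five sample values shown for 부서코드.
samples = ["D0000003", "D0000004", "D0001406", "D0000940", "D0000984"]


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    return Counter(unicodedata.category(ch) for v in values for ch in v)


cats = category_counts(samples)
total_chars = sum(cats.values())
print(dict(cats), total_chars)
```

Each code is one uppercase letter followed by seven digits, so the 503 codes contain 503 × 8 = 4024 characters, which agrees with the reported total.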

Unique

Unique: 503
Unique (%): 100.0%

Sample

1st row: D0000003
2nd row: D0000004
3rd row: D0001406
4th row: D0000940
5th row: D0000984

Value | Count | Frequency (%)
d0000003 | 1 | 0.2%
d0000755 | 1 | 0.2%
d0001010 | 1 | 0.2%
d0000433 | 1 | 0.2%
d0000396 | 1 | 0.2%
d0001620 | 1 | 0.2%
d0001496 | 1 | 0.2%
d0001489 | 1 | 0.2%
d0001181 | 1 | 0.2%
d0001394 | 1 | 0.2%
Other values (493) | 493 | 98.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 1807 | 44.9%
1 | 511 | 12.7%
D | 503 | 12.5%
4 | 190 | 4.7%
2 | 181 | 4.5%
3 | 167 | 4.2%
9 | 156 | 3.9%
5 | 153 | 3.8%
7 | 121 | 3.0%
6 | 118 | 2.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 3521 | 87.5%
Uppercase Letter | 503 | 12.5%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1807 | 51.3%
1 | 511 | 14.5%
4 | 190 | 5.4%
2 | 181 | 5.1%
3 | 167 | 4.7%
9 | 156 | 4.4%
5 | 153 | 4.3%
7 | 121 | 3.4%
6 | 118 | 3.4%
8 | 117 | 3.3%

Uppercase Letter
Value | Count | Frequency (%)
D | 503 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 3521 | 87.5%
Latin | 503 | 12.5%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1807 | 51.3%
1 | 511 | 14.5%
4 | 190 | 5.4%
2 | 181 | 5.1%
3 | 167 | 4.7%
9 | 156 | 4.4%
5 | 153 | 4.3%
7 | 121 | 3.4%
6 | 118 | 3.4%
8 | 117 | 3.3%

Latin
Value | Count | Frequency (%)
D | 503 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4024 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1807 | 44.9%
1 | 511 | 12.7%
D | 503 | 12.5%
4 | 190 | 4.7%
2 | 181 | 4.5%
3 | 167 | 4.2%
9 | 156 | 3.9%
5 | 153 | 3.8%
7 | 121 | 3.0%
6 | 118 | 2.9%

부서명 (department name)
Text

Distinct: 242
Distinct (%): 48.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Length

Max length: 20
Median length: 15
Mean length: 5.0934394
Min length: 2

Characters and Unicode

Total characters: 2562
Distinct characters: 233
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 184
Unique (%): 36.6%

Sample

1st row: 사장
2nd row: 감사
3rd row: 경영전략본부
4th row: 정비사업본부
5th row: 에너지사업본부

Value | Count | Frequency (%)
공무 | 27 | 5.2%
지사부 | 14 | 2.7%
안전공무팀 | 14 | 2.7%
업무지원팀 | 11 | 2.1%
관로정비부 | 10 | 1.9%
방식반 | 10 | 1.9%
계전파트 | 10 | 1.9%
기계파트 | 10 | 1.9%
기전부 | 10 | 1.9%
계기ㆍ계량반 | 9 | 1.7%
Other values (240) | 398 | 76.1%

Most occurring characters

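Value | Count | Frequency (%)
[glyph lost] | 116 | 4.5%
[glyph lost] | 113 | 4.4%
[glyph lost] | 112 | 4.4%
[glyph lost] | 111 | 4.3%
[glyph lost] | 98 | 3.8%
[glyph lost] | 98 | 3.8%
[glyph lost] | 97 | 3.8%
[glyph lost] | 89 | 3.5%
[glyph lost] | 81 | 3.2%
[glyph lost] | 76 | 3.0%
Other values (223) | 1571 | 61.3%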

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2294 | 89.5%
Decimal Number | 107 | 4.2%
Uppercase Letter | 63 | 2.5%
Close Punctuation | 32 | 1.2%
Open Punctuation | 32 | 1.2%
Space Separator | 20 | 0.8%
Other Punctuation | 12 | 0.5%
Dash Punctuation | 2 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[glyph lost] | 116 | 5.1%
[glyph lost] | 113 | 4.9%
[glyph lost] | 112 | 4.9%
[glyph lost] | 111 | 4.8%
[glyph lost] | 98 | 4.3%
[glyph lost] | 98 | 4.3%
[glyph lost] | 97 | 4.2%
[glyph lost] | 89 | 3.9%
[glyph lost] | 81 | 3.5%
[glyph lost] | 76 | 3.3%
Other values (191) | 1303 | 56.8%

Uppercase Letter
Value | Count | Frequency (%)
T | 17 | 27.0%
F | 8 | 12.7%
C | 6 | 9.5%
L | 6 | 9.5%
G | 5 | 7.9%
M | 5 | 7.9%
I | 4 | 6.3%
P | 4 | 6.3%
N | 3 | 4.8%
O | 2 | 3.2%
Other values (3) | 3 | 4.8%

Decimal Number
Value | Count | Frequency (%)
1 | 29 | 27.1%
2 | 23 | 21.5%
4 | 9 | 8.4%
3 | 9 | 8.4%
6 | 8 | 7.5%
5 | 8 | 7.5%
7 | 8 | 7.5%
8 | 6 | 5.6%
9 | 5 | 4.7%
0 | 2 | 1.9%

Other Punctuation
Value | Count | Frequency (%)
· | 9 | 75.0%
& | 2 | 16.7%
/ | 1 | 8.3%

Close Punctuation
Value | Count | Frequency (%)
) | 30 | 93.8%
] | 2 | 6.2%

Open Punctuation
Value | Count | Frequency (%)
( | 30 | 93.8%
[ | 2 | 6.2%

Space Separator
Value | Count | Frequency (%)
(space) | 20 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2294 | 89.5%
Common | 205 | 8.0%
Latin | 63 | 2.5%

Most frequent character per script

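Hangul
Value | Count | Frequency (%)
[glyph lost] | 116 | 5.1%
[glyph lost] | 113 | 4.9%
[glyph lost] | 112 | 4.9%
[glyph lost] | 111 | 4.8%
[glyph lost] | 98 | 4.3%
[glyph lost] | 98 | 4.3%
[glyph lost] | 97 | 4.2%
[glyph lost] | 89 | 3.9%
[glyph lost] | 81 | 3.5%
[glyph lost] | 76 | 3.3%
Other values (191) | 1303 | 56.8%

Common
Value | Count | Frequency (%)
) | 30 | 14.6%
( | 30 | 14.6%
1 | 29 | 14.1%
2 | 23 | 11.2%
(space) | 20 | 9.8%
4 | 9 | 4.4%
· | 9 | 4.4%
3 | 9 | 4.4%
6 | 8 | 3.9%
5 | 8 | 3.9%
Other values (9) | 30 | 14.6%

Latin
Value | Count | Frequency (%)
T | 17 | 27.0%
F | 8 | 12.7%
C | 6 | 9.5%
L | 6 | 9.5%
G | 5 | 7.9%
M | 5 | 7.9%
I | 4 | 6.3%
P | 4 | 6.3%
N | 3 | 4.8%
O | 2 | 3.2%
Other values (3) | 3 | 4.8%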

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2275 | 88.8%
ASCII | 259 | 10.1%
Compat Jamo | 19 | 0.7%
None | 9 | 0.4%

Most frequent character per block

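Hangul
Value | Count | Frequency (%)
[glyph lost] | 116 | 5.1%
[glyph lost] | 113 | 5.0%
[glyph lost] | 112 | 4.9%
[glyph lost] | 111 | 4.9%
[glyph lost] | 98 | 4.3%
[glyph lost] | 98 | 4.3%
[glyph lost] | 97 | 4.3%
[glyph lost] | 89 | 3.9%
[glyph lost] | 81 | 3.6%
[glyph lost] | 76 | 3.3%
Other values (190) | 1284 | 56.4%

ASCII
Value | Count | Frequency (%)
) | 30 | 11.6%
( | 30 | 11.6%
1 | 29 | 11.2%
2 | 23 | 8.9%
(space) | 20 | 7.7%
T | 17 | 6.6%
4 | 9 | 3.5%
3 | 9 | 3.5%
F | 8 | 3.1%
6 | 8 | 3.1%
Other values (21) | 76 | 29.3%

Compat Jamo
Value | Count | Frequency (%)
[glyph lost] | 19 | 100.0%

None
Value | Count | Frequency (%)
· | 9 | 100.0%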

등록아이디 (registrant ID)
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Value | Count
182361 | 482
20-007 | 18
<NA> | 1
141944 | 1
19134 | 1

Length

Max length: 6
Median length: 6
Mean length: 5.9940358
Min length: 4
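
These length statistics follow directly from the value counts reported for this variable. The sketch below assumes that the single `<NA>` entry is measured as the literal four-character string, which is consistent with the minimum of 4 and the reported mean. The variable names are illustrative.

```python
from statistics import mean, median

# Value counts for 등록아이디 as listed in the report.
value_counts = {"182361": 482, "20-007": 18, "<NA>": 1, "141944": 1, "19134": 1}

# Expand to one string length per observation (503 in total).
lengths = [len(value) for value, n in value_counts.items() for _ in range(n)]

length_stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(length_stats)
```

The mean works out to 3015 / 503, roughly 5.9940358, which matches the reported value.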

Unique

Unique: 3
Unique (%): 0.6%
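
The report separates "distinct" (how many different values occur) from "unique" (how many values occur exactly once). A minimal sketch of that distinction, using the value counts reported for 등록아이디 and arbitrary variable names:

```python
from collections import Counter

# Value counts for 등록아이디 as listed in the report.
value_counts = Counter(
    {"182361": 482, "20-007": 18, "<NA>": 1, "141944": 1, "19134": 1}
)

n_obs = sum(value_counts.values())
n_distinct = len(value_counts)  # different values
n_unique = sum(1 for c in value_counts.values() if c == 1)  # seen exactly once

print(f"Distinct: {n_distinct} ({n_distinct / n_obs:.1%})")  # Distinct: 5 (1.0%)
print(f"Unique: {n_unique} ({n_unique / n_obs:.1%})")  # Unique: 3 (0.6%)
```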

Sample

1st row: 182361
2nd row: 182361
3rd row: 182361
4th row: 182361
5th row: 182361

Common Values

Value | Count | Frequency (%)
182361 | 482 | 95.8%
20-007 | 18 | 3.6%
<NA> | 1 | 0.2%
141944 | 1 | 0.2%
19134 | 1 | 0.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value | Count | Frequency (%)
182361 | 482 | 95.8%
20-007 | 18 | 3.6%
na | 1 | 0.2%
141944 | 1 | 0.2%
19134 | 1 | 0.2%

Missing values

[Figure: nullity count plot] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

Index | 부서코드 | 부서명 | 등록아이디
0 | D0000003 | 사장 | 182361
1 | D0000004 | 감사 | 182361
2 | D0001406 | 경영전략본부 | 182361
3 | D0000940 | 정비사업본부 | 182361
4 | D0000984 | 에너지사업본부 | 182361
5 | D0000941 | 안전품질처 | 182361
6 | D0001351 | 고려아연 배관운영사업 계약추진 TFT | 182361
7 | D0001376 | 저압프로세스파트 | 182361
8 | D0001105 | 5반(구간)[군자분소] | 182361
9 | D0000078 | 감사실 | 182361

Index | 부서코드 | 부서명 | 등록아이디
493 | D0001446 | 수소액화PMC부 | 182361
494 | D0001173 | 5구간 | 182361
495 | D0001174 | 6구간 | 182361
496 | D0001214 | 1구간 | 182361
497 | D0001215 | 2구간 | 182361
498 | D0001235 | 4구간 | 182361
499 | D0001193 | 4구간(영동) | 182361
500 | D0001194 | 5구간(영동) | 182361
501 | D0001195 | 6구간(거창) | 182361
502 | D0001196 | 시설물 | 182361