Dataset statistics
Number of variables | 4 |
---|---|
Number of observations | 191 |
Missing cells | 109 |
Missing cells (%) | 14.3% |
Duplicate rows | 4 |
Duplicate rows (%) | 2.1% |
Total size in memory | 6.3 KiB |
Average record size in memory | 33.7 B |
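The two percentages above can be checked from the counts. The sketch below assumes the missing-cell ratio is taken over all cells (observations × variables) and the duplicate ratio over all rows; the variable names are illustrative and not part of the report.

```python
# Counts copied from the "Dataset statistics" table.
n_variables = 4
n_observations = 191
missing_cells = 109      # 102 (화학기호) + 7 (카스번호(CAS No)) per the alerts
duplicate_rows = 4

total_cells = n_variables * n_observations          # 764 cells in total
missing_pct = 100 * missing_cells / total_cells     # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")     # Missing cells (%): 14.3%
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")  # Duplicate rows (%): 2.1%
```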
Variable types
Text | 3 |
---|---|
Categorical | 1 |
Dataset
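Description | Physical-property information (gas name, chemical symbol, inspection cycle) for the 191 toxic gases subject to inspection by the Korea Gas Safety Corporation (한국가스안전공사), published to give the general public an overview of toxic gases. |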
---|---|
URL | https://www.data.go.kr/data/15067783/fileData.do |
Alerts
Dataset has 4 (2.1%) duplicate rows | Duplicates |
화학기호 has 102 (53.4%) missing values | Missing |
카스번호(CAS No) has 7 (3.7%) missing values | Missing |
Reproduction
Analysis started | 2023-12-12 02:37:16.059034 |
---|---|
Analysis finished | 2023-12-12 02:37:16.486165 |
Duration | 0.43 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
가스명 (gas name)
Text
Distinct | 174 |
---|---|
Distinct (%) | 91.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.6 KiB |
Value | Count | Frequency (%) |
0.1%b2h6/h2 | 5 | 2.2% |
5%b2h6/n2 | 5 | 2.2% |
co | 5 | 2.2% |
3 | 1.3% | |
bcl3 | 3 | 1.3% |
n2+sif4 | 3 | 1.3% |
15%b2h6 | 3 | 1.3% |
암모니아 | 2 | 0.9% |
toxic | 2 | 0.9% |
0.95%f2/3.5%ar/ne | 2 | 0.9% |
Other values (180) | 190 | 85.2% |
Most occurring characters
Value | Count | Frequency (%) |
H | 121 | 7.0% |
2 | 116 | 6.7% |
/ | 98 | 5.7% |
C | 80 | 4.6% |
N | 67 | 3.9% |
% | 59 | 3.4% |
O | 51 | 3.0% |
B | 41 | 2.4% |
3 | 40 | 2.3% |
F | 40 | 2.3% |
Other values (141) | 1014 | 58.7% |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 669 | 38.7% |
Other Letter | 327 | 18.9% |
Decimal Number | 300 | 17.4% |
Other Punctuation | 210 | 12.2% |
Lowercase Letter | 120 | 6.9% |
Space Separator | 32 | 1.9% |
Math Symbol | 24 | 1.4% |
Open Punctuation | 22 | 1.3% |
Close Punctuation | 20 | 1.2% |
Dash Punctuation | 3 | 0.2% |
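The category labels above match the names of Unicode general categories. A minimal sketch of how such counts could be reproduced with Python's standard `unicodedata` module follows; the label mapping, the function name `category_counts`, and the four sample strings (taken from gas names shown elsewhere in this report) are illustrative assumptions, not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# Unicode general-category codes mapped to the labels used in the report
# (only the categories that appear in this column).
CATEGORY_LABELS = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Lo": "Other Letter",        # Hangul syllables fall here
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",   # e.g. '/', '%', ',', '.'
    "Zs": "Space Separator",
    "Sm": "Math Symbol",         # e.g. '+'
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_LABELS.get(code, code)] += 1
    return counts


sample = ["염화수소", "0.1%B2H6/H2", "PH3+Ar", "CH3CL(17%)+HF(83%)"]
counts = category_counts(sample)
for label, n in counts.most_common():
    print(f"{label} | {n}")
```

Run over the full 가스명 column instead of the four sample strings, the same loop should yield a table of the shape shown above.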
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
화 | 28 | 8.6% |
소 | 22 | 6.7% |
스 | 11 | 3.4% |
불 | 11 | 3.4% |
로 | 10 | 3.1% |
아 | 9 | 2.8% |
오 | 9 | 2.8% |
수 | 8 | 2.4% |
사 | 7 | 2.1% |
산 | 7 | 2.1% |
Other values (79) | 205 | 62.7% |
Uppercase Letter
Value | Count | Frequency (%) |
H | 121 | 18.1% |
C | 80 | 12.0% |
N | 67 | 10.0% |
O | 51 | 7.6% |
B | 41 | 6.1% |
F | 40 | 6.0% |
A | 37 | 5.5% |
S | 35 | 5.2% |
E | 28 | 4.2% |
L | 23 | 3.4% |
Other values (15) | 146 | 21.8% |
Lowercase Letter
Value | Count | Frequency (%) |
e | 38 | 31.7% |
r | 24 | 20.0% |
i | 13 | 10.8% |
l | 13 | 10.8% |
o | 7 | 5.8% |
n | 4 | 3.3% |
a | 4 | 3.3% |
t | 3 | 2.5% |
s | 2 | 1.7% |
d | 2 | 1.7% |
Other values (8) | 10 | 8.3% |
Decimal Number
Value | Count | Frequency (%) |
2 | 116 | 38.7% |
3 | 40 | 13.3% |
1 | 34 | 11.3% |
5 | 31 | 10.3% |
6 | 26 | 8.7% |
4 | 22 | 7.3% |
0 | 20 | 6.7% |
9 | 5 | 1.7% |
8 | 4 | 1.3% |
7 | 2 | 0.7% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 98 | 46.7% |
% | 59 | 28.1% |
, | 31 | 14.8% |
. | 22 | 10.5% |
Space Separator
Value | Count | Frequency (%) |
(space) | 32 | 100.0% |
Math Symbol
Value | Count | Frequency (%) |
+ | 24 | 100.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 22 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 20 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 3 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 789 | 45.7% |
Common | 611 | 35.4% |
Hangul | 327 | 18.9% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
화 | 28 | 8.6% |
소 | 22 | 6.7% |
스 | 11 | 3.4% |
불 | 11 | 3.4% |
로 | 10 | 3.1% |
아 | 9 | 2.8% |
오 | 9 | 2.8% |
수 | 8 | 2.4% |
사 | 7 | 2.1% |
산 | 7 | 2.1% |
Other values (79) | 205 | 62.7% |
Latin
Value | Count | Frequency (%) |
H | 121 | 15.3% |
C | 80 | 10.1% |
N | 67 | 8.5% |
O | 51 | 6.5% |
B | 41 | 5.2% |
F | 40 | 5.1% |
e | 38 | 4.8% |
A | 37 | 4.7% |
S | 35 | 4.4% |
E | 28 | 3.5% |
Other values (33) | 251 | 31.8% |
Common
Value | Count | Frequency (%) |
2 | 116 | 19.0% |
/ | 98 | 16.0% |
% | 59 | 9.7% |
3 | 40 | 6.5% |
1 | 34 | 5.6% |
(space) | 32 | 5.2% |
5 | 31 | 5.1% |
, | 31 | 5.1% |
6 | 26 | 4.3% |
+ | 24 | 3.9% |
Other values (9) | 120 | 19.6% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 1400 | 81.1% |
Hangul | 327 | 18.9% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
H | 121 | 8.6% |
2 | 116 | 8.3% |
/ | 98 | 7.0% |
C | 80 | 5.7% |
N | 67 | 4.8% |
% | 59 | 4.2% |
O | 51 | 3.6% |
B | 41 | 2.9% |
3 | 40 | 2.9% |
F | 40 | 2.9% |
Other values (52) | 687 | 49.1% |
Hangul
Value | Count | Frequency (%) |
화 | 28 | 8.6% |
소 | 22 | 6.7% |
스 | 11 | 3.4% |
불 | 11 | 3.4% |
로 | 10 | 3.1% |
아 | 9 | 2.8% |
오 | 9 | 2.8% |
수 | 8 | 2.4% |
사 | 7 | 2.1% |
산 | 7 | 2.1% |
Other values (79) | 205 | 62.7% |
화학기호 (chemical symbol)
Text
MISSING
 
Distinct | 75 |
---|---|
Distinct (%) | 84.3% |
Missing | 102 |
Missing (%) | 53.4% |
Memory size | 1.6 KiB |
Value | Count | Frequency (%) |
so2 | 4 | 4.4% |
nh3 | 3 | 3.3% |
bf3 | 3 | 3.3% |
gef4 | 3 | 3.3% |
15%b2h6 | 3 | 3.3% |
sif4 | 2 | 2.2% |
sih4 | 2 | 2.2% |
bcl3 | 2 | 2.2% |
b2h6 | 2 | 2.2% |
hf | 2 | 2.2% |
Other values (60) | 64 | 71.1% |
Most occurring characters
Value | Count | Frequency (%) |
H | 62 | 13.5% |
2 | 50 | 10.9% |
C | 32 | 7.0% |
3 | 29 | 6.3% |
F | 24 | 5.2% |
N | 21 | 4.6% |
S | 21 | 4.6% |
4 | 21 | 4.6% |
B | 18 | 3.9% |
% | 17 | 3.7% |
Other values (31) | 163 | 35.6% |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 228 | 49.8% |
Decimal Number | 145 | 31.7% |
Lowercase Letter | 40 | 8.7% |
Other Punctuation | 32 | 7.0% |
Math Symbol | 8 | 1.7% |
Close Punctuation | 2 | 0.4% |
Open Punctuation | 2 | 0.4% |
Space Separator | 1 | 0.2% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
H | 62 | 27.2% |
C | 32 | 14.0% |
F | 24 | 10.5% |
N | 21 | 9.2% |
S | 21 | 9.2% |
B | 18 | 7.9% |
O | 16 | 7.0% |
P | 7 | 3.1% |
L | 6 | 2.6% |
A | 6 | 2.6% |
Other values (7) | 15 | 6.6% |
Decimal Number
Value | Count | Frequency (%) |
2 | 50 | 34.5% |
3 | 29 | 20.0% |
4 | 21 | 14.5% |
6 | 17 | 11.7% |
5 | 11 | 7.6% |
1 | 9 | 6.2% |
0 | 5 | 3.4% |
8 | 2 | 1.4% |
9 | 1 | 0.7% |
Lowercase Letter
Value | Count | Frequency (%) |
i | 10 | 25.0% |
e | 9 | 22.5% |
l | 7 | 17.5% |
r | 6 | 15.0% |
s | 3 | 7.5% |
c | 2 | 5.0% |
a | 2 | 5.0% |
b | 1 | 2.5% |
Other Punctuation
Value | Count | Frequency (%) |
% | 17 | 53.1% |
/ | 12 | 37.5% |
, | 3 | 9.4% |
Math Symbol
Value | Count | Frequency (%) |
+ | 8 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 2 | 100.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 2 | 100.0% |
Space Separator
Value | Count | Frequency (%) |
(space) | 1 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 268 | 58.5% |
Common | 190 | 41.5% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
H | 62 | 23.1% |
C | 32 | 11.9% |
F | 24 | 9.0% |
N | 21 | 7.8% |
S | 21 | 7.8% |
B | 18 | 6.7% |
O | 16 | 6.0% |
i | 10 | 3.7% |
e | 9 | 3.4% |
l | 7 | 2.6% |
Other values (15) | 48 | 17.9% |
Common
Value | Count | Frequency (%) |
2 | 50 | 26.3% |
3 | 29 | 15.3% |
4 | 21 | 11.1% |
% | 17 | 8.9% |
6 | 17 | 8.9% |
/ | 12 | 6.3% |
5 | 11 | 5.8% |
1 | 9 | 4.7% |
+ | 8 | 4.2% |
0 | 5 | 2.6% |
Other values (6) | 11 | 5.8% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 458 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
H | 62 | 13.5% |
2 | 50 | 10.9% |
C | 32 | 7.0% |
3 | 29 | 6.3% |
F | 24 | 5.2% |
N | 21 | 4.6% |
S | 21 | 4.6% |
4 | 21 | 4.6% |
B | 18 | 3.9% |
% | 17 | 3.7% |
Other values (31) | 163 | 35.6% |
검사주기 (inspection cycle)
Categorical
Distinct | 5 |
---|---|
Distinct (%) | 2.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.6 KiB |
0 | 125 |
---|---|
1 | 58 |
12 | 5 |
4 | 2 |
6 | 1 |
Length
Max length | 2 |
---|---|
Median length | 1 |
Mean length | 1.026178 |
Min length | 1 |
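The length statistics can be cross-checked against the value counts listed under Common Values for this variable. The sketch below assumes each value is stored as the string shown there; the names are illustrative.

```python
# Value counts copied from the "Common Values" table for 검사주기.
value_counts = {"0": 125, "1": 58, "12": 5, "4": 2, "6": 1}

n = sum(value_counts.values())                       # number of observations
lengths = {value: len(value) for value in value_counts}

# Weighted mean of string lengths: only "12" has two characters.
mean_length = sum(lengths[v] * c for v, c in value_counts.items()) / n
max_length = max(lengths.values())
min_length = min(lengths.values())

print(n, max_length, min_length, round(mean_length, 6))  # 191 2 1 1.026178
```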
Unique
Unique | 1 |
---|---|
Unique (%) | 0.5% |
Sample
1st row | 1 |
---|---|
2nd row | 1 |
3rd row | 1 |
4th row | 1 |
5th row | 0 |
Common Values
Value | Count | Frequency (%) |
0 | 125 | 65.4% |
1 | 58 | 30.4% |
12 | 5 | 2.6% |
4 | 2 | 1.0% |
6 | 1 | 0.5% |
카스번호(CAS No)
Text
MISSING
 
Distinct | 115 |
---|---|
Distinct (%) | 62.5% |
Missing | 7 |
Missing (%) | 3.7% |
Memory size | 1.6 KiB |
Length
Max length | 124 |
---|---|
Median length | 59 |
Mean length | 24.51087 |
Min length | 7 |
Characters and Unicode
Total characters | 4510 |
---|---|
Distinct characters | 47 |
Distinct categories | 7 |
Distinct scripts | 3 |
Distinct blocks | 2 |
Unique
Unique | 81 |
---|---|
Unique (%) | 44.0% |
Sample
1st row | 7647-01-0 |
---|---|
2nd row | 10294-34-5 |
3rd row | 7783-61-1 |
4th row | 7783-82-6 |
5th row | 7783-60-0 |
Value | Count | Frequency (%) |
192 | ||
7727-37-9 | 23 | 3.7% |
b2h6 | 22 | 3.5% |
7440-01-9 | 18 | 2.9% |
f2 | 14 | 2.2% |
1333-74-0 | 14 | 2.2% |
hcl | 12 | 1.9% |
ph3 | 10 | 1.6% |
19287-45-7/h2 | 9 | 1.4% |
7782-41-4/ar | 8 | 1.3% |
Other values (184) | 308 | 48.9% |
Most occurring characters
Value | Count | Frequency (%) |
- | 665 | 14.7% |
7 | 504 | 11.2% |
(space) | 447 | 9.9% |
4 | 324 | 7.2% |
2 | 277 | 6.1% |
0 | 270 | 6.0% |
: | 255 | 5.7% |
3 | 255 | 5.7% |
1 | 228 | 5.1% |
9 | 166 | 3.7% |
Other values (37) | 1119 | 24.8% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 2473 | 54.8% |
Dash Punctuation | 665 | 14.7% |
Space Separator | 447 | 9.9% |
Uppercase Letter | 419 | 9.3% |
Other Punctuation | 405 | 9.0% |
Lowercase Letter | 95 | 2.1% |
Other Letter | 6 | 0.1% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
H | 113 | 27.0% |
C | 66 | 15.8% |
N | 58 | 13.8% |
F | 38 | 9.1% |
O | 31 | 7.4% |
B | 31 | 7.4% |
S | 17 | 4.1% |
A | 13 | 3.1% |
P | 12 | 2.9% |
K | 8 | 1.9% |
Other values (10) | 32 | 7.6% |
Decimal Number
Value | Count | Frequency (%) |
7 | 504 | 20.4% |
4 | 324 | 13.1% |
2 | 277 | 11.2% |
0 | 270 | 10.9% |
3 | 255 | 10.3% |
1 | 228 | 9.2% |
9 | 166 | 6.7% |
6 | 161 | 6.5% |
5 | 147 | 5.9% |
8 | 141 | 5.7% |
Lowercase Letter
Value | Count | Frequency (%) |
e | 36 | 37.9% |
l | 23 | 24.2% |
r | 19 | 20.0% |
i | 11 | 11.6% |
o | 3 | 3.2% |
t | 1 | 1.1% |
b | 1 | 1.1% |
a | 1 | 1.1% |
Other Letter
Value | Count | Frequency (%) |
벤 | 2 | 33.3% |
젠 | 2 | 33.3% |
놀 | 1 | 16.7% |
페 | 1 | 16.7% |
Other Punctuation
Value | Count | Frequency (%) |
: | 255 | 63.0% |
/ | 149 | 36.8% |
, | 1 | 0.2% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 665 | 100.0% |
Space Separator
Value | Count | Frequency (%) |
(space) | 447 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 3990 | 88.5% |
Latin | 514 | 11.4% |
Hangul | 6 | 0.1% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
H | 113 | 22.0% |
C | 66 | 12.8% |
N | 58 | 11.3% |
F | 38 | 7.4% |
e | 36 | 7.0% |
O | 31 | 6.0% |
B | 31 | 6.0% |
l | 23 | 4.5% |
r | 19 | 3.7% |
S | 17 | 3.3% |
Other values (18) | 82 | 16.0% |
Common
Value | Count | Frequency (%) |
- | 665 | 16.7% |
7 | 504 | 12.6% |
(space) | 447 | 11.2% |
4 | 324 | 8.1% |
2 | 277 | 6.9% |
0 | 270 | 6.8% |
: | 255 | 6.4% |
3 | 255 | 6.4% |
1 | 228 | 5.7% |
9 | 166 | 4.2% |
Other values (5) | 599 | 15.0% |
Hangul
Value | Count | Frequency (%) |
벤 | 2 | 33.3% |
젠 | 2 | 33.3% |
놀 | 1 | 16.7% |
페 | 1 | 16.7% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 4504 | 99.9% |
Hangul | 6 | 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
- | 665 | 14.8% |
7 | 504 | 11.2% |
(space) | 447 | 9.9% |
4 | 324 | 7.2% |
2 | 277 | 6.2% |
0 | 270 | 6.0% |
: | 255 | 5.7% |
3 | 255 | 5.7% |
1 | 228 | 5.1% |
9 | 166 | 3.7% |
Other values (33) | 1113 | 24.7% |
Hangul
Value | Count | Frequency (%) |
벤 | 2 | 33.3% |
젠 | 2 | 33.3% |
놀 | 1 | 16.7% |
페 | 1 | 16.7% |
Correlations
Variable | 화학기호 | 검사주기 |
---|---|---|
화학기호 | 1.000 | 0.000 |
검사주기 | 0.000 | 1.000 |
First rows
Index | 가스명 | 화학기호 | 검사주기 | 카스번호(CAS No) |
---|---|---|---|---|
0 | 염화수소 | Hcl | 1 | 7647-01-0 |
1 | 삼염화붕소 | Bcl3 | 1 | 10294-34-5 |
2 | 사불화규소 | SiF4 | 1 | 7783-61-1 |
3 | 육불화텅스텐 | WF6 | 1 | 7783-82-6 |
4 | 사불화유황 | SF4 | 0 | 7783-60-0 |
5 | 포스핀 | PH3 | 1 | 7803-51-2 |
6 | 디실란 | Si2H6 | 1 | 1590-87-0 |
7 | 삼불화붕소 | BF3 | 1 | 7637-07-02 |
8 | 아크릴로니트릴 | C2H3CN | 1 | 107-13-1 |
9 | 아크릴알데히드 | C3H4O | 1 | 107-02-8 |
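The 카스번호(CAS No) values in these rows can be sanity-checked with the public CAS Registry Number check-digit rule: the last digit equals the weighted sum of the preceding digits (weights 1, 2, 3, ... counted from the right) modulo 10. The sketch below is an illustrative validator, not part of the profiling report; the function name `is_valid_cas` is hypothetical. It accepts nine of the ten rows and rejects `7637-07-02` (row 7) on format alone, because the final group must be a single digit.

```python
import re

# CAS Registry Number layout: 2-7 digits, 2 digits, 1 check digit.
CAS_PATTERN = re.compile(r"([0-9]{2,7})-([0-9]{2})-([0-9])")


def is_valid_cas(cas: str) -> bool:
    """Return True if `cas` has the CAS layout and a matching check digit."""
    match = CAS_PATTERN.fullmatch(cas)
    if match is None:
        return False
    body = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight digits 1, 2, 3, ... starting from the rightmost body digit.
    total = sum(
        weight * int(digit)
        for weight, digit in enumerate(reversed(body), start=1)
    )
    return total % 10 == check_digit


first_rows = [
    "7647-01-0", "10294-34-5", "7783-61-1", "7783-82-6", "7783-60-0",
    "7803-51-2", "1590-87-0", "7637-07-02", "107-13-1", "107-02-8",
]
invalid = [cas for cas in first_rows if not is_valid_cas(cas)]
print(invalid)  # ['7637-07-02']
```

Values that combine several components, such as `B2H6 : 19287-45-7/H2 : 1333-74-0` in later rows, would need to be split before applying this check.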
Last rows
Index | 가스명 | 화학기호 | 검사주기 | 카스번호(CAS No) |
---|---|---|---|---|
181 | 0.1%B2H6/H2 | <NA> | 1 | B2H6 : 19287-45-7/H2 : 1333-74-0 |
182 | 0.1%B2H6/H2 | <NA> | 1 | B2H6 : 19287-45-7/H2 : 1333-74-0 |
183 | TOXIC | <NA> | 4 | <NA> |
184 | MTBE/WATER | <NA> | 0 | MTBE : 1634-04-4 |
185 | N2+SiF4 | <NA> | 0 | SiH4 : 7803-62-5/N2 : 7727-37-9 |
186 | PH3+Ar | PH3+Ar | 0 | PH3 : 7803-51-2/Ar : 7440-37-1 |
187 | CH3CL(17%)+HF(83%) | <NA> | 0 | CH3Cl : 74-87-3/HF : 7664-39-3 |
188 | 5%B2H6/N2 | <NA> | 1 | B2H6 : 19287-45-7/N2 : 7727-37-9 |
189 | 옥타플루오르화부테인 | C4F8 | 12 | C4F8 : 115-25-3 |
190 | HBr acid | <NA> | 0 | 10035-10-6 |
Most frequently occurring
Index | 가스명 | 화학기호 | 검사주기 | 카스번호(CAS No) | # duplicates |
---|---|---|---|---|---|
0 | 0.1%B2H6/H2 | <NA> | 1 | B2H6 : 19287-45-7/H2 : 1333-74-0 | 5 |
1 | 0.95%F2/3.5%Ar/Ne | <NA> | 0 | F2: 7782-41-4/Ar : 7440-37-1/Ne: 7440-01-9 | 2 |
2 | 5%B2H6/N2 | <NA> | 0 | B2H6 : 19287-45-7/N2 : 7727-37-9 | 2 |
3 | 5%B2H6/N2 | <NA> | 1 | B2H6 : 19287-45-7/N2 : 7727-37-9 | 2 |
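The table lists four distinct duplicated rows, which matches the "Duplicate rows | 4" figure in the dataset statistics, while the "# duplicates" column gives how often each one occurs. A minimal sketch of that grouping, using a handful of rows copied from this report (not the full dataset) and treating `<NA>` as `None`, could look like this; the variable names are illustrative.

```python
from collections import Counter

# (가스명, 화학기호, 검사주기, 카스번호(CAS No)) tuples; a small illustrative subset.
rows = [
    ("0.1%B2H6/H2", None, "1", "B2H6 : 19287-45-7/H2 : 1333-74-0"),
    ("0.1%B2H6/H2", None, "1", "B2H6 : 19287-45-7/H2 : 1333-74-0"),
    ("5%B2H6/N2", None, "1", "B2H6 : 19287-45-7/N2 : 7727-37-9"),
    ("5%B2H6/N2", None, "0", "B2H6 : 19287-45-7/N2 : 7727-37-9"),
    ("HBr acid", None, "0", "10035-10-6"),
]

occurrences = Counter(rows)

# Keep only rows that occur more than once; rows differing in any
# column (here 검사주기) are counted as different rows.
duplicated = {row: n for row, n in occurrences.items() if n > 1}

for row, n in duplicated.items():
    print(row[0], "| # duplicates:", n)  # 0.1%B2H6/H2 | # duplicates: 2
```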