Dataset statistics
Number of variables | 9 |
---|---|
Number of observations | 100 |
Missing cells | 100 |
Missing cells (%) | 11.1% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 7.3 KiB |
Average record size in memory | 74.3 B |
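The missing-cell percentage follows from the counts above: 9 variables times 100 observations give 900 cells, and 100 missing cells are 11.1% of them. A minimal Python sketch of that arithmetic follows; `missing_cell_stats` is a hypothetical helper, and representing a missing value as `None` is an assumption made for illustration.

```python
def missing_cell_stats(rows):
    """Return (missing_cells, total_cells, missing_pct) for a list of dict rows.

    Assumes a missing value is represented as None.
    """
    total = sum(len(row) for row in rows)
    missing = sum(1 for row in rows for value in row.values() if value is None)
    pct = 100.0 * missing / total if total else 0.0
    return missing, total, pct


# Toy data shaped like the report: 9 columns, 100 rows, one column entirely empty.
columns = [f"col{i}" for i in range(9)]
rows = [{c: (None if c == "col2" else "x") for c in columns} for _ in range(100)]

missing, total, pct = missing_cell_stats(rows)
print(missing, total, f"{pct:.1f}%")  # 100 900 11.1%
```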
Variable types
Text | 2 |
---|---|
Unsupported | 1 |
Boolean | 1 |
Categorical | 5 |
Dataset
Description | Sample data |
---|---|
Author | 그린에코스 |
URL | https://www.bigdata-environment.kr/user/data_market/detail.do?id=da1851f0-c0d7-11ea-b78f-33609a6276e1 |
Alerts
출처 has constant value "ACToR" | Constant |
갱신내용 has constant value "Last updated 09/2019" | Constant |
플래그 is highly overall correlated with UVCB 여부 and 2 other fields | High correlation |
UVCB 여부 is highly overall correlated with 플래그 and 1 other field | High correlation |
플래그 정의 is highly overall correlated with UVCB 여부 and 2 other fields | High correlation |
상용여부 is highly overall correlated with 플래그 and 1 other field | High correlation |
UVCB 여부 is highly imbalanced (85.9%) | Imbalance |
플래그 is highly imbalanced (64.9%) | Imbalance |
플래그 정의 is highly imbalanced (64.9%) | Imbalance |
정의 has 100 (100.0%) missing values | Missing |
정의 is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
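The imbalance percentages in these alerts are consistent with a score of one minus the normalized Shannon entropy of the class counts. That this is the profiler's exact internal formula is an assumption; the sketch below only shows that it reproduces the two reported figures from the value counts listed later in this report. The function name `imbalance_score` is hypothetical.

```python
import math


def imbalance_score(counts):
    """One minus the Shannon entropy of the class distribution, normalized by log2(k).

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    counts = [c for c in counts if c > 0]
    k = len(counts)
    if k < 2:
        return 0.0  # convention chosen here for a single class
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts)
    return 1.0 - entropy / math.log2(k)


# Counts taken from the value tables in this report.
uvcb = imbalance_score([98, 2])           # UVCB 여부: False / True
flag = imbalance_score([87, 4, 3, 3, 3])  # 플래그: <NA>, S, PMN, SP, 5E
print(f"{uvcb:.1%} {flag:.1%}")  # 85.9% 64.9%
```

플래그 정의 has the same count distribution as 플래그 (87, 4, 3, 3, 3), which is why both columns report 64.9%.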
Reproduction
Analysis started | 2023-12-10 11:21:27.670024 |
---|---|
Analysis finished | 2023-12-10 11:21:28.911404 |
Duration | 1.24 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
CAS등록번호 (CAS registry number)
Text
Distinct | 94 |
---|---|
Distinct (%) | 94.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Value | Count | Frequency (%) |
50-45-3 | 3 | 3.0% |
50-30-6 | 3 | 3.0% |
51-36-5 | 3 | 3.0% |
51-83-2 | 1 | 1.0% |
51-57-0 | 1 | 1.0% |
51-52-5 | 1 | 1.0% |
51-48-9 | 1 | 1.0% |
51-46-7 | 1 | 1.0% |
51-45-6 | 1 | 1.0% |
51-44-5 | 1 | 1.0% |
Other values (84) | 84 | 84.0% |
Most occurring characters
Value | Count | Frequency (%) |
- | 200 | 26.5% |
5 | 112 | 14.8% |
1 | 84 | 11.1% |
0 | 73 | 9.7% |
2 | 50 | 6.6% |
3 | 48 | 6.3% |
6 | 40 | 5.3% |
7 | 40 | 5.3% |
4 | 38 | 5.0% |
9 | 36 | 4.8% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 556 | 73.5% |
Dash Punctuation | 200 | 26.5% |
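"Decimal Number" and "Dash Punctuation" are the Unicode general categories `Nd` and `Pd`. As a rough illustration of how such a table can be derived, the sketch below counts characters by category with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and the input is only three CAS numbers taken from the sample rows of this report, not the full column.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters by Unicode general category (e.g. 'Nd', 'Pd') across all strings."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


# Three CAS numbers from the sample rows of this report.
cas_numbers = ["50-00-0", "623-37-0", "770-35-4"]
counts = category_counts(cas_numbers)
total = sum(counts.values())
for code, n in counts.most_common():
    print(code, n, f"{100 * n / total:.1f}%")
```

Each CAS number contains exactly two hyphens, which matches the 200 dash characters reported for 100 observations.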
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
5 | 112 | 20.1% |
1 | 84 | 15.1% |
0 | 73 | 13.1% |
2 | 50 | 9.0% |
3 | 48 | 8.6% |
6 | 40 | 7.2% |
7 | 40 | 7.2% |
4 | 38 | 6.8% |
9 | 36 | 6.5% |
8 | 35 | 6.3% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 200 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 756 | 100.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
- | 200 | 26.5% |
5 | 112 | 14.8% |
1 | 84 | 11.1% |
0 | 73 | 9.7% |
2 | 50 | 6.6% |
3 | 48 | 6.3% |
6 | 40 | 5.3% |
7 | 40 | 5.3% |
4 | 38 | 5.0% |
9 | 36 | 4.8% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 756 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
- | 200 | 26.5% |
5 | 112 | 14.8% |
1 | 84 | 11.1% |
0 | 73 | 9.7% |
2 | 50 | 6.6% |
3 | 48 | 6.3% |
6 | 40 | 5.3% |
7 | 40 | 5.3% |
4 | 38 | 5.0% |
9 | 36 | 4.8% |
화학물질영문 (chemical name, English)
Text
Distinct | 94 |
---|---|
Distinct (%) | 94.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Length
Max length | 100 |
---|---|
Median length | 54 |
Mean length | 37.81 |
Min length | 6 |
Characters and Unicode
Total characters | 3781 |
---|---|
Distinct characters | 59 |
Distinct categories | 8 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 91 |
---|---|
Unique (%) | 91.0% |
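"Distinct" and "Unique" measure different things here: 94 distinct values but only 91 unique ones. The numbers are consistent with "distinct" meaning the number of different values and "unique" meaning values that occur exactly once, so the remaining 3 distinct values cover the other 9 rows. The sketch below illustrates that reading on synthetic data; the helper name `distinct_and_unique` and the placeholder values are hypothetical.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique): number of different values, and number occurring exactly once."""
    freq = Counter(values)
    distinct = len(freq)
    unique = sum(1 for n in freq.values() if n == 1)
    return distinct, unique


# Synthetic column with the same shape as the report:
# 91 one-off values plus 3 values repeated 3 times each (100 rows in total).
values = [f"name-{i}" for i in range(91)] + ["a", "b", "c"] * 3
print(distinct_and_unique(values))  # (94, 91)
```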
Sample
1st row | Formaldehyde |
---|---|
2nd row | 3-Hexanol |
3rd row | 2-Propanol, 1-phenoxy- |
4th row | 2-Propanol, 1,1,1,3,3,3-hexafluoro- |
5th row | Guanidine, hydrochloride (1:1) |
Value | Count | Frequency (%) |
acid | 27 | 10.5% |
benzoic | 14 | 5.5% |
1:1 | 9 | 3.5% |
ester | 9 | 3.5% |
hydrochloride | 6 | 2.3% |
benzene | 4 | 1.6% |
phenol | 4 | 1.6% |
2,6-dichloro | 3 | 1.2% |
11.beta | 3 | 1.2% |
3,5-dichloro | 3 | 1.2% |
Other values (158) | 174 | 68.0% |
Most occurring characters
Value | Count | Frequency (%) |
- | 321 | 8.5% |
e | 279 | 7.4% |
o | 243 | 6.4% |
i | 219 | 5.8% |
, | 205 | 5.4% |
n | 176 | 4.7% |
a | 163 | 4.3% |
l | 160 | 4.2% |
(space) | 157 | 4.2% |
h | 150 | 4.0% |
Other values (49) | 1708 | 45.2% |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 2369 | 62.7% |
Decimal Number | 347 | 9.2% |
Dash Punctuation | 321 | 8.5% |
Other Punctuation | 264 | 7.0% |
Uppercase Letter | 164 | 4.3% |
Space Separator | 157 | 4.2% |
Open Punctuation | 83 | 2.2% |
Close Punctuation | 76 | 2.0% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
e | 279 | 11.8% |
o | 243 | 10.3% |
i | 219 | 9.2% |
n | 176 | 7.4% |
a | 163 | 6.9% |
l | 160 | 6.8% |
h | 150 | 6.3% |
y | 147 | 6.2% |
d | 141 | 6.0% |
t | 136 | 5.7% |
Other values (11) | 555 | 23.4% |
Uppercase Letter
Value | Count | Frequency (%) |
B | 28 | 17.1% |
P | 27 | 16.5% |
N | 21 | 12.8% |
H | 18 | 11.0% |
S | 10 | 6.1% |
O | 8 | 4.9% |
E | 8 | 4.9% |
R | 8 | 4.9% |
C | 7 | 4.3% |
A | 6 | 3.7% |
Other values (8) | 23 | 14.0% |
Decimal Number
Value | Count | Frequency (%) |
1 | 111 | 32.0% |
2 | 64 | 18.4% |
3 | 52 | 15.0% |
4 | 41 | 11.8% |
5 | 18 | 5.2% |
8 | 16 | 4.6% |
6 | 15 | 4.3% |
7 | 14 | 4.0% |
9 | 9 | 2.6% |
0 | 7 | 2.0% |
Other Punctuation
Value | Count | Frequency (%) |
, | 205 | 77.7% |
. | 41 | 15.5% |
: | 12 | 4.5% |
' | 6 | 2.3% |
Open Punctuation
Value | Count | Frequency (%) |
( | 60 | 72.3% |
[ | 23 | 27.7% |
Close Punctuation
Value | Count | Frequency (%) |
) | 57 | 75.0% |
] | 19 | 25.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 321 | 100.0% |
Space Separator
Value | Count | Frequency (%) |
(space) | 157 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 2533 | 67.0% |
Common | 1248 | 33.0% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
e | 279 | 11.0% |
o | 243 | 9.6% |
i | 219 | 8.6% |
n | 176 | 6.9% |
a | 163 | 6.4% |
l | 160 | 6.3% |
h | 150 | 5.9% |
y | 147 | 5.8% |
d | 141 | 5.6% |
t | 136 | 5.4% |
Other values (29) | 719 | 28.4% |
Common
Value | Count | Frequency (%) |
- | 321 | 25.7% |
, | 205 | 16.4% |
(space) | 157 | 12.6% |
1 | 111 | 8.9% |
2 | 64 | 5.1% |
( | 60 | 4.8% |
) | 57 | 4.6% |
3 | 52 | 4.2% |
. | 41 | 3.3% |
4 | 41 | 3.3% |
Other values (10) | 139 | 11.1% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 3781 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
- | 321 | 8.5% |
e | 279 | 7.4% |
o | 243 | 6.4% |
i | 219 | 5.8% |
, | 205 | 5.4% |
n | 176 | 4.7% |
a | 163 | 4.3% |
l | 160 | 4.2% |
(space) | 157 | 4.2% |
h | 150 | 4.0% |
Other values (49) | 1708 | 45.2% |
정의 (definition)
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 100 |
---|---|
Missing (%) | 100.0% |
Memory size | 1.0 KiB |
UVCB 여부 (UVCB status)
Boolean
HIGH CORRELATION
  IMBALANCE
 
Distinct | 2 |
---|---|
Distinct (%) | 2.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 232.0 B |
False | 98 |
---|---|
True | 2 |
Value | Count | Frequency (%) |
False | 98 | 98.0% |
True | 2 | 2.0% |
플래그 (flag)
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 5 |
---|---|
Distinct (%) | 5.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
<NA> | 87 |
---|---|
S | 4 |
PMN | 3 |
SP | 3 |
5E | 3 |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 3.73 |
Min length | 1 |
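These length statistics can be reproduced from the value counts of this column if the 87 `<NA>` entries are treated as a literal four-character string, which would also explain why the column reports 0 missing values. That reading is an inference from the numbers, not something the report states. A minimal check using only the standard library:

```python
from statistics import mean, median

# Value counts from the 플래그 table; "<NA>" is assumed to be stored as a literal string.
value_counts = {"<NA>": 87, "S": 4, "PMN": 3, "SP": 3, "5E": 3}

# Expand the counts into one length per observation.
lengths = [len(value) for value, n in value_counts.items() for _ in range(n)]

max_len = max(lengths)
median_len = median(lengths)
mean_len = round(mean(lengths), 2)
min_len = min(lengths)
print(max_len, median_len, mean_len, min_len)
```

The same calculation for 상용여부 (77 three-character values and 23 four-character values) gives a mean length of 3.23, matching that column's reported figure.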
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 87 | 87.0% |
S | 4 | 4.0% |
PMN | 3 | 3.0% |
SP | 3 | 3.0% |
5E | 3 | 3.0% |
Common Values (Plot)
Value | Count | Frequency (%) |
na | 87 | 87.0% |
s | 4 | 4.0% |
pmn | 3 | 3.0% |
sp | 3 | 3.0% |
5e | 3 | 3.0% |
플래그 정의 (flag definition)
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 5 |
---|---|
Distinct (%) | 5.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
<NA> | 87 |
---|---|
SNUR(중요신규용도규정) 대상물질(a substance that is identified in a final Significant New Use Rule) | 4 |
사전제조신고 물질(a commenced PMN substance) | 3 |
SNUR(중요신규용도규정(안)) 대상물질(a substance that is identified in a proposed Significant New Use Rule) | 3 |
TSCA section 5(e) 대상물질C3:C16 (a substance that is the subject of a TSCA section 5(e) order) | 3 |
Length
Max length | 93 |
---|---|
Median length | 4 |
Mean length | 13.56 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 87 | 87.0% |
SNUR(중요신규용도규정) 대상물질(a substance that is identified in a final Significant New Use Rule) | 4 | 4.0% |
사전제조신고 물질(a commenced PMN substance) | 3 | 3.0% |
SNUR(중요신규용도규정(안)) 대상물질(a substance that is identified in a proposed Significant New Use Rule) | 3 | 3.0% |
TSCA section 5(e) 대상물질C3:C16 (a substance that is the subject of a TSCA section 5(e) order) | 3 | 3.0% |
Common Values (Plot)
Value | Count | Frequency (%) |
na | 87 | 36.7% |
substance | 13 | 5.5% |
a | 13 | 5.5% |
that | 10 | 4.2% |
is | 10 | 4.2% |
in | 7 | 3.0% |
rule | 7 | 3.0% |
significant | 7 | 3.0% |
identified | 7 | 3.0% |
대상물질(a | 7 | 3.0% |
Other values (19) | 69 | 29.1% |
상용여부 (commercial-use status: 활성화 = active, 비활성화 = inactive)
Categorical
HIGH CORRELATION
 
Distinct | 2 |
---|---|
Distinct (%) | 2.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
활성화 | 77 |
---|---|
비활성화 | 23 |
Length
Max length | 4 |
---|---|
Median length | 3 |
Mean length | 3.23 |
Min length | 3 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 활성화 |
---|---|
2nd row | 활성화 |
3rd row | 활성화 |
4th row | 활성화 |
5th row | 활성화 |
Common Values
Value | Count | Frequency (%) |
활성화 | 77 | 77.0% |
비활성화 | 23 | 23.0% |
Common Values (Plot)
Value | Count | Frequency (%) |
활성화 | 77 | 77.0% |
비활성화 | 23 | 23.0% |
출처 (source)
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | 1.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
ACToR | 100 |
---|---|
Length
Max length | 5 |
---|---|
Median length | 5 |
Mean length | 5 |
Min length | 5 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | ACToR |
---|---|
2nd row | ACToR |
3rd row | ACToR |
4th row | ACToR |
5th row | ACToR |
Common Values
Value | Count | Frequency (%) |
ACToR | 100 | 100.0% |
Common Values (Plot)
Value | Count | Frequency (%) |
actor | 100 | 100.0% |
갱신내용 (update details)
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | 1.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Last updated 09/2019 | 100 |
---|---|
Length
Max length | 20 |
---|---|
Median length | 20 |
Mean length | 20 |
Min length | 20 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | Last updated 09/2019 |
---|---|
2nd row | Last updated 09/2019 |
3rd row | Last updated 09/2019 |
4th row | Last updated 09/2019 |
5th row | Last updated 09/2019 |
Common Values
Value | Count | Frequency (%) |
Last updated 09/2019 | 100 | 100.0% |
Common Values (Plot)
Value | Count | Frequency (%) |
last | 100 | 33.3% |
updated | 100 | 33.3% |
09/2019 | 100 | 33.3% |
Correlations
Variable | CAS등록번호 | 화학물질영문 | UVCB 여부 | 플래그 | 플래그 정의 | 상용여부 |
---|---|---|---|---|---|---|
CAS등록번호 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 1.000 |
화학물질영문 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 1.000 |
UVCB 여부 | 1.000 | 1.000 | 1.000 | NaN | NaN | 0.000 |
플래그 | 0.000 | 0.000 | NaN | 1.000 | 1.000 | NaN |
플래그 정의 | 0.000 | 0.000 | NaN | 1.000 | 1.000 | NaN |
상용여부 | 1.000 | 1.000 | 0.000 | NaN | NaN | 1.000 |
Variable | 플래그 | UVCB 여부 | 플래그 정의 | 상용여부 |
---|---|---|---|---|
플래그 | 1.000 | 1.000 | 1.000 | 1.000 |
UVCB 여부 | 1.000 | 1.000 | 1.000 | 0.000 |
플래그 정의 | 1.000 | 1.000 | 1.000 | 1.000 |
상용여부 | 1.000 | 0.000 | 1.000 | 1.000 |
Variable | UVCB 여부 | 플래그 | 플래그 정의 | 상용여부 |
---|---|---|---|---|
UVCB 여부 | 1.000 | 1.000 | 1.000 | 0.000 |
플래그 | 1.000 | 1.000 | 1.000 | 1.000 |
플래그 정의 | 1.000 | 1.000 | 1.000 | 1.000 |
상용여부 | 0.000 | 1.000 | 1.000 | 1.000 |
First rows
Index | CAS등록번호 | 화학물질영문 | 정의 | UVCB 여부 | 플래그 | 플래그 정의 | 상용여부 | 출처 | 갱신내용 |
---|---|---|---|---|---|---|---|---|---|
0 | 50-00-0 | Formaldehyde | <NA> | N | <NA> | <NA> | 활성화 | ACToR | Last updated 09/2019 |
1 | 623-37-0 | 3-Hexanol | <NA> | N | <NA> | <NA> | 활성화 | ACToR | Last updated 09/2019 |
2 | 770-35-4 | 2-Propanol, 1-phenoxy- | <NA> | N | <NA> | <NA> | 활성화 | ACToR | Last updated 09/2019 |
3 | 920-66-1 | 2-Propanol, 1,1,1,3,3,3-hexafluoro- | <NA> | N | <NA> | <NA> | 활성화 | ACToR | Last updated 09/2019 |
4 | 50-01-1 | Guanidine, hydrochloride (1:1) | <NA> | N | <NA> | <NA> | 활성화 | ACToR | Last updated 09/2019 |
5 | 1070-40-2 | 5-Decyne-4,7-diol | <NA> | N | <NA> | <NA> | 활성화 | ACToR | Last updated 09/2019 |
6 | 1117-79-9 | Octane, 3-chloro- | <NA> | N | <NA> | <NA> | 비활성화 | ACToR | Last updated 09/2019 |
7 | 1122-81-2 | Pyridine, 4-propyl- | <NA> | N | <NA> | <NA> | 비활성화 | ACToR | Last updated 09/2019 |
8 | 50-02-2 | Pregna-1,4-diene-3,20-dione, 9-fluoro-11,17,21-trihydroxy-16-methyl-, (11.beta.,16.alpha.)- | <NA> | N | <NA> | <NA> | 활성화 | ACToR | Last updated 09/2019 |
9 | 1313-60-6 | Sodium peroxide (Na2(O2)) | <NA> | N | <NA> | <NA> | 활성화 | ACToR | Last updated 09/2019 |
Last rows
Index | CAS등록번호 | 화학물질영문 | 정의 | UVCB 여부 | 플래그 | 플래그 정의 | 상용여부 | 출처 | 갱신내용 |
---|---|---|---|---|---|---|---|---|---|
90 | 51-93-4 | Ethanaminium, N,N,N-trimethyl-, iodide (1:1) | <NA> | N | <NA> | <NA> | 비활성화 | ACToR | Last updated 09/2019 |
91 | 52-01-7 | Pregn-4-ene-21-carboxylic acid, 7-(acetylthio)-17-hydroxy-3-oxo-, .gamma.-lactone, (7.alpha.,17.alph | <NA> | N | <NA> | <NA> | 비활성화 | ACToR | Last updated 09/2019 |
92 | 52-39-1 | Pregn-4-en-18-al, 11,21-dihydroxy-3,20-dioxo-, (11.beta.)- | <NA> | N | <NA> | <NA> | 활성화 | ACToR | Last updated 09/2019 |
93 | 52-51-7 | 1,3-Propanediol, 2-bromo-2-nitro- | <NA> | N | <NA> | <NA> | 활성화 | ACToR | Last updated 09/2019 |
94 | 52-52-8 | Cyclopentanecarboxylic acid, 1-amino- | <NA> | N | <NA> | <NA> | 활성화 | ACToR | Last updated 09/2019 |
95 | 52-85-7 | Phosphorothioic acid, O-[4-[(dimethylamino)sulfonyl]phenyl] O,O-dimethyl ester | <NA> | N | <NA> | <NA> | 활성화 | ACToR | Last updated 09/2019 |
96 | 52-88-0 | 8-Azoniabicyclo[3.2.1]octane, 3-(3-hydroxy-1-oxo-2-phenylpropoxy)-8,8-dimethyl-, (3-endo)-, nitrate | <NA> | N | <NA> | <NA> | 비활성화 | ACToR | Last updated 09/2019 |
97 | 52-89-1 | L-Cysteine, hydrochloride (1:1) | <NA> | N | <NA> | <NA> | 활성화 | ACToR | Last updated 09/2019 |
98 | 52-90-4 | L-Cysteine | <NA> | N | <NA> | <NA> | 활성화 | ACToR | Last updated 09/2019 |
99 | 53-03-2 | Pregna-1,4-diene-3,11,20-trione, 17,21-dihydroxy- | <NA> | N | <NA> | <NA> | 비활성화 | ACToR | Last updated 09/2019 |