Overview

Dataset statistics

Number of variables: 10
Number of observations: 7559
Missing cells: 7997
Missing cells (%): 10.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 642.3 KiB
Average record size in memory: 87.0 B
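
The derived figures in this table can be recomputed from the raw counts. The sketch below is a consistency check only; it assumes that the missing cells come solely from the two columns flagged in the Alerts section (국문 and Unnamed: 9) and that 1 KiB is 1024 bytes.

```python
# Figures copied from the "Dataset statistics" and "Alerts" sections.
n_variables = 10
n_observations = 7559
# Assumption: every column not listed here reports 0 missing values.
missing_per_column = {"국문": 438, "Unnamed: 9": 7559}
total_size_kib = 642.3

total_cells = n_variables * n_observations
missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_size_kib * 1024 / n_observations

print(missing_cells)               # total missing cells
print(round(missing_pct, 1))       # share of missing cells, in percent
print(round(avg_record_bytes, 1))  # average record size, in bytes
```

Most of the 10.6% missing share comes from the entirely empty Unnamed: 9 column; without it, only the 438 missing 국문 values remain.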

Variable types

Text: 3
Categorical: 5
Numeric: 1
Unsupported: 1

Alerts

High correlation: 노출_피부 is highly correlated overall with 사고노출위험구분수 and 3 other fields
High correlation: 노출_흡입 is highly correlated overall with 사고노출위험구분수 and 3 other fields
High correlation: 노출_안구 is highly correlated overall with 사고노출위험구분수 and 3 other fields
High correlation: 노출_경구 is highly correlated overall with 사고노출위험구분수 and 3 other fields
High correlation: 사고노출위험구분수 is highly correlated overall with 인체_일반증상 and 4 other fields
High correlation: 인체_일반증상 is highly correlated overall with 사고노출위험구분수
Missing: 국문 has 438 (5.8%) missing values
Missing: Unnamed: 9 has 7559 (100.0%) missing values
Unique: 고유(CAS)번호 has unique values
Unsupported: Unnamed: 9 is an unsupported type; check whether it needs cleaning or further analysis
Zeros: 사고노출위험구분수 has 1737 (23.0%) zeros
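
A column named like `Unnamed: 9` is what pandas assigns to a column with a blank header, so a trailing delimiter in the source file is a plausible cause, though the report does not confirm it. The sketch below shows one way to drop columns that are empty in every row, using only the standard library. The helper `drop_empty_columns` and the three-line CSV excerpt are hypothetical and serve only as illustration.

```python
import csv
import io


def drop_empty_columns(header, rows):
    """Return (header, rows) without the columns whose every cell is blank."""
    keep = [
        i for i in range(len(header))
        if any(i < len(row) and row[i].strip() for row in rows)
    ]
    new_header = [header[i] for i in keep]
    new_rows = [[row[i] if i < len(row) else "" for i in keep] for row in rows]
    return new_header, new_rows


# Hypothetical excerpt: the trailing comma on each line creates an empty 4th column.
raw = (
    "cas,name,flag,\n"
    "75-31-0,isopropylamine,1,\n"
    '681-84-5,"Tetramethyl,silicate",1,\n'
)
header, *rows = csv.reader(io.StringIO(raw))
clean_header, clean_rows = drop_empty_columns(header, rows)
print(clean_header)
print(clean_rows)
```

If the data is loaded with pandas instead, `DataFrame.dropna(axis=1, how="all")` removes all-missing columns in a similar way.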

Reproduction

Analysis started: 2024-01-09 21:27:31.228977
Analysis finished: 2024-01-09 21:27:32.224423
Duration: 1 second
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

고유(CAS)번호 (unique CAS number)
Text

UNIQUE 

Distinct: 7559
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 59.2 KiB

Length

Max length: 13
Median length: 12
Mean length: 9.9757904
Min length: 8

Characters and Unicode

Total characters: 75407
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7559
Unique (%): 100.0%

Sample

1st row 75-31-0
2nd row 681-84-5
3rd row 57-92-1
4th row 75-12-7
5th row 75-20-7
Value | Count | Frequency (%)
75-31-0 | 1 | < 0.1%
78-87-5 | 1 | < 0.1%
79-34-5 | 1 | < 0.1%
793-24-8 | 1 | < 0.1%
79-24-3 | 1 | < 0.1%
79-22-1 | 1 | < 0.1%
79-21-0 | 1 | < 0.1%
79-20-9 | 1 | < 0.1%
79-19-6 | 1 | < 0.1%
78-96-6 | 1 | < 0.1%
Other values (7549) | 7549 | 99.9%

Most occurring characters

Value | Count | Frequency (%)
- | 15118 | 20.0%
1 | 7779 | 10.3%
(space) | 7559 | 10.0%
2 | 5498 | 7.3%
0 | 5233 | 6.9%
5 | 5108 | 6.8%
3 | 5089 | 6.7%
6 | 4992 | 6.6%
7 | 4970 | 6.6%
4 | 4875 | 6.5%
Other values (2) | 9186 | 12.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 52730 | 69.9%
Dash Punctuation | 15118 | 20.0%
Space Separator | 7559 | 10.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 7779 | 14.8%
2 | 5498 | 10.4%
0 | 5233 | 9.9%
5 | 5108 | 9.7%
3 | 5089 | 9.7%
6 | 4992 | 9.5%
7 | 4970 | 9.4%
4 | 4875 | 9.2%
8 | 4664 | 8.8%
9 | 4522 | 8.6%

Dash Punctuation
Value | Count | Frequency (%)
- | 15118 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 7559 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 75407 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 15118 | 20.0%
1 | 7779 | 10.3%
(space) | 7559 | 10.0%
2 | 5498 | 7.3%
0 | 5233 | 6.9%
5 | 5108 | 6.8%
3 | 5089 | 6.7%
6 | 4992 | 6.6%
7 | 4970 | 6.6%
4 | 4875 | 6.5%
Other values (2) | 9186 | 12.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 75407 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 15118 | 20.0%
1 | 7779 | 10.3%
(space) | 7559 | 10.0%
2 | 5498 | 7.3%
0 | 5233 | 6.9%
5 | 5108 | 6.8%
3 | 5089 | 6.7%
6 | 4992 | 6.6%
7 | 4970 | 6.6%
4 | 4875 | 6.5%
Other values (2) | 9186 | 12.2%
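
The character statistics above show one space per value (7559 spaces across 7559 rows), and the sample rows suggest it is a leading space. A cleaning step would therefore likely strip whitespace before using the identifiers. The sketch below strips the value and then applies the standard CAS Registry Number check-digit rule (the last digit equals the weighted sum of the preceding digits, counted from the right, modulo 10). The function names `normalize_cas` and `is_valid_cas` are illustrative and not part of the report or of ydata-profiling.

```python
import re

# CAS Registry Number layout: 2-7 digits, 2 digits, 1 check digit.
CAS_RE = re.compile(r"(\d{2,7})-(\d{2})-(\d)")


def normalize_cas(raw):
    """Strip surrounding whitespace such as the leading space seen in this column."""
    return raw.strip()


def is_valid_cas(cas):
    """Check the format and the check digit of an already normalized CAS number."""
    m = CAS_RE.fullmatch(cas)
    if m is None:
        return False
    body = m.group(1) + m.group(2)
    # Weight the digits 1, 2, 3, ... starting from the rightmost digit of the body.
    weighted = sum(pos * int(d) for pos, d in enumerate(reversed(body), start=1))
    return weighted % 10 == int(m.group(3))


# The five sample values from this section, with the leading space kept.
samples = [" 75-31-0", " 681-84-5", " 57-92-1", " 75-12-7", " 75-20-7"]
for raw in samples:
    cas = normalize_cas(raw)
    print(cas, is_valid_cas(cas))
```

Because the regular expression is matched against the whole string, an unstripped value fails validation, which makes leftover whitespace easy to spot.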

영문 (English name)
Text

Distinct: 7504
Distinct (%): 99.3%
Missing: 0
Missing (%): 0.0%
Memory size: 59.2 KiB

Length

Max length: 343
Median length: 240
Mean length: 29.696918
Min length: 2

Characters and Unicode

Total characters: 224479
Distinct characters: 105
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7455
Unique (%): 98.6%

Sample

1st row isopropylamine
2nd row Tetramethyl,silicate
3rd row Streptomycin
4th row Formamide
5th row Calcium,carbide

Value | Count | Frequency (%)
acid | 108 | 1.2%
salt | 59 | 0.7%
with | 41 | 0.5%
1:1 | 40 | 0.5%
sodium | 38 | 0.4%
chloride | 32 | 0.4%
ester | 19 | 0.2%
potassium | 18 | 0.2%
sulfate | 16 | 0.2%
nickel | 15 | 0.2%
Other values (7803) | 8299 | 95.6%

Most occurring characters

Value | Count | Frequency (%)
e | 20205 | 9.0%
o | 15874 | 7.1%
i | 14836 | 6.6%
l | 13604 | 6.1%
t | 12312 | 5.5%
n | 12245 | 5.5%
a | 11822 | 5.3%
, | 11503 | 5.1%
- | 10919 | 4.9%
y | 10135 | 4.5%
Other values (95) | 91024 | 40.5%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 173978 | 77.5%
Other Punctuation | 12476 | 5.6%
Dash Punctuation | 10919 | 4.9%
Decimal Number | 10856 | 4.8%
Uppercase Letter | 9863 | 4.4%
Open Punctuation | 2577 | 1.1%
Close Punctuation | 2568 | 1.1%
Space Separator | 1131 | 0.5%
Math Symbol | 91 | < 0.1%
Modifier Symbol | 13 | < 0.1%
Other values (2) | 7 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 20205 | 11.6%
o | 15874 | 9.1%
i | 14836 | 8.5%
l | 13604 | 7.8%
t | 12312 | 7.1%
n | 12245 | 7.0%
a | 11822 | 6.8%
y | 10135 | 5.8%
r | 9918 | 5.7%
h | 9818 | 5.6%
Other values (25) | 43209 | 24.8%

Uppercase Letter
Value | Count | Frequency (%)
D | 1160 | 11.8%
T | 902 | 9.1%
C | 862 | 8.7%
B | 787 | 8.0%
N | 774 | 7.8%
M | 724 | 7.3%
P | 712 | 7.2%
I | 579 | 5.9%
H | 565 | 5.7%
A | 543 | 5.5%
Other values (17) | 2255 | 22.9%

Other Punctuation
Value | Count | Frequency (%)
, | 11503 | 92.2%
' | 364 | 2.9%
. | 189 | 1.5%
: | 146 | 1.2%
; | 119 | 1.0%
/ | 67 | 0.5%
(not recoverable) | 40 | 0.3%
" | 19 | 0.2%
* | 15 | 0.1%
% | 6 | < 0.1%
Other values (4) | 8 | 0.1%

Decimal Number
Value | Count | Frequency (%)
2 | 3301 | 30.4%
1 | 2466 | 22.7%
3 | 1677 | 15.4%
4 | 1594 | 14.7%
5 | 753 | 6.9%
6 | 506 | 4.7%
7 | 172 | 1.6%
8 | 154 | 1.4%
9 | 129 | 1.2%
0 | 104 | 1.0%

Math Symbol
Value | Count | Frequency (%)
+ | 51 | 56.0%
= | 26 | 28.6%
~ | 7 | 7.7%
± | 3 | 3.3%
> | 2 | 2.2%
< | 1 | 1.1%
(not recoverable) | 1 | 1.1%

Close Punctuation
Value | Count | Frequency (%)
) | 1843 | 71.8%
] | 719 | 28.0%
} | 6 | 0.2%

Open Punctuation
Value | Count | Frequency (%)
( | 1837 | 71.3%
[ | 736 | 28.6%
{ | 4 | 0.2%

Letter Number
Value | Count | Frequency (%)
(not recoverable) | 3 | 75.0%
(not recoverable) | 1 | 25.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 10919 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1131 | 100.0%

Modifier Symbol
Value | Count | Frequency (%)
` | 13 | 100.0%

Final Punctuation
Value | Count | Frequency (%)
(not recoverable) | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 183679 | 81.8%
Common | 40634 | 18.1%
Greek | 166 | 0.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 20205 | 11.0%
o | 15874 | 8.6%
i | 14836 | 8.1%
l | 13604 | 7.4%
t | 12312 | 6.7%
n | 12245 | 6.7%
a | 11822 | 6.4%
y | 10135 | 5.5%
r | 9918 | 5.4%
h | 9818 | 5.3%
Other values (44) | 52910 | 28.8%

Common
Value | Count | Frequency (%)
, | 11503 | 28.3%
- | 10919 | 26.9%
2 | 3301 | 8.1%
1 | 2466 | 6.1%
) | 1843 | 4.5%
( | 1837 | 4.5%
3 | 1677 | 4.1%
4 | 1594 | 3.9%
(space) | 1131 | 2.8%
5 | 753 | 1.9%
Other values (31) | 3610 | 8.9%

Greek
Value | Count | Frequency (%)
α | 87 | 52.4%
ω | 35 | 21.1%
β | 15 | 9.0%
κ | 10 | 6.0%
μ | 8 | 4.8%
λ | 4 | 2.4%
η | 3 | 1.8%
ε | 2 | 1.2%
γ | 1 | 0.6%
Ο | 1 | 0.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 224261 | 99.9%
None | 170 | 0.1%
Punctuation | 43 | < 0.1%
Number Forms | 4 | < 0.1%
Math Operators | 1 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 20205 | 9.0%
o | 15874 | 7.1%
i | 14836 | 6.6%
l | 13604 | 6.1%
t | 12312 | 5.5%
n | 12245 | 5.5%
a | 11822 | 5.3%
, | 11503 | 5.1%
- | 10919 | 4.9%
y | 10135 | 4.5%
Other values (78) | 90806 | 40.5%

None
Value | Count | Frequency (%)
α | 87 | 51.2%
ω | 35 | 20.6%
β | 15 | 8.8%
κ | 10 | 5.9%
μ | 8 | 4.7%
λ | 4 | 2.4%
± | 3 | 1.8%
η | 3 | 1.8%
ε | 2 | 1.2%
γ | 1 | 0.6%
Other values (2) | 2 | 1.2%

Punctuation
Value | Count | Frequency (%)
(not recoverable) | 40 | 93.0%
(not recoverable) | 3 | 7.0%

Number Forms
Value | Count | Frequency (%)
(not recoverable) | 3 | 75.0%
(not recoverable) | 1 | 25.0%

Math Operators
Value | Count | Frequency (%)
(not recoverable) | 1 | 100.0%

국문 (Korean name)
Text

MISSING 

Distinct: 7086
Distinct (%): 99.5%
Missing: 438
Missing (%): 5.8%
Memory size: 59.2 KiB

Length

Max length: 214
Median length: 160
Mean length: 16.848617
Min length: 1

Characters and Unicode

Total characters: 119979
Distinct characters: 539
Distinct categories: 14
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7059
Unique (%): 99.1%

Sample

1st row 아이소프로필아민
2nd row 테트라메틸실리케이트
3rd row 스트렙토마이신
4th row 폼아마이드
5th row 칼슘,카바이드

Value | Count | Frequency (%)
1:1 | 29 | 0.4%
(not recoverable) | 22 | 0.3%
나트륨 | 11 | 0.1%
염화 | 10 | 0.1%
칼륨 | 9 | 0.1%
트리페닐술포늄과 | 8 | 0.1%
노닐페놀류 | 7 | 0.1%
에테르 | 7 | 0.1%
1,1,1-트리플루오로-n-[(트리플루오로메틸)술포닐]메탄술폰아미드의 | 7 | 0.1%
칼슘 | 7 | 0.1%
Other values (7246) | 7404 | 98.4%

Most occurring characters

Value | Count | Frequency (%)
- | 10029 | 8.4%
, | 9471 | 7.9%
(not recoverable) | 7345 | 6.1%
(not recoverable) | 4196 | 3.5%
(not recoverable) | 3240 | 2.7%
(not recoverable) | 3229 | 2.7%
2 | 3007 | 2.5%
(not recoverable) | 2901 | 2.4%
(not recoverable) | 2313 | 1.9%
1 | 2266 | 1.9%
Other values (529) | 71982 | 60.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 81956 | 68.3%
Other Punctuation | 10256 | 8.5%
Decimal Number | 10044 | 8.4%
Dash Punctuation | 10029 | 8.4%
Open Punctuation | 2382 | 2.0%
Close Punctuation | 2372 | 2.0%
Uppercase Letter | 1640 | 1.4%
Lowercase Letter | 797 | 0.7%
Space Separator | 402 | 0.3%
Math Symbol | 80 | 0.1%
Other values (4) | 21 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not recoverable) | 7345 | 9.0%
(not recoverable) | 4196 | 5.1%
(not recoverable) | 3240 | 4.0%
(not recoverable) | 3229 | 3.9%
(not recoverable) | 2901 | 3.5%
(not recoverable) | 2313 | 2.8%
(not recoverable) | 2133 | 2.6%
(not recoverable) | 1957 | 2.4%
(not recoverable) | 1787 | 2.2%
(not recoverable) | 1721 | 2.1%
Other values (427) | 51134 | 62.4%

Lowercase Letter
Value | Count | Frequency (%)
t | 115 | 14.4%
e | 93 | 11.7%
r | 71 | 8.9%
a | 62 | 7.8%
α | 56 | 7.0%
o | 41 | 5.1%
p | 39 | 4.9%
c | 37 | 4.6%
n | 34 | 4.3%
i | 28 | 3.5%
Other values (23) | 221 | 27.7%

Uppercase Letter
Value | Count | Frequency (%)
N | 534 | 32.6%
I | 287 | 17.5%
H | 125 | 7.6%
C | 119 | 7.3%
O | 103 | 6.3%
S | 66 | 4.0%
R | 59 | 3.6%
T | 54 | 3.3%
D | 44 | 2.7%
E | 43 | 2.6%
Other values (14) | 206 | 12.6%

Other Punctuation
Value | Count | Frequency (%)
, | 9471 | 92.3%
' | 350 | 3.4%
: | 137 | 1.3%
. | 127 | 1.2%
/ | 67 | 0.7%
(not recoverable) | 34 | 0.3%
" | 18 | 0.2%
; | 13 | 0.1%
· | 12 | 0.1%
* | 11 | 0.1%
Other values (6) | 16 | 0.2%

Decimal Number
Value | Count | Frequency (%)
2 | 3007 | 29.9%
1 | 2266 | 22.6%
3 | 1549 | 15.4%
4 | 1512 | 15.1%
5 | 712 | 7.1%
6 | 475 | 4.7%
7 | 165 | 1.6%
8 | 143 | 1.4%
9 | 121 | 1.2%
0 | 94 | 0.9%

Math Symbol
Value | Count | Frequency (%)
+ | 44 | 55.0%
= | 24 | 30.0%
~ | 8 | 10.0%
> | 2 | 2.5%
< | 1 | 1.2%
± | 1 | 1.2%

Close Punctuation
Value | Count | Frequency (%)
) | 1705 | 71.9%
] | 662 | 27.9%
} | 5 | 0.2%

Open Punctuation
Value | Count | Frequency (%)
( | 1697 | 71.2%
[ | 682 | 28.6%
{ | 3 | 0.1%

Letter Number
Value | Count | Frequency (%)
(not recoverable) | 3 | 75.0%
(not recoverable) | 1 | 25.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 10029 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 402 | 100.0%

Modifier Symbol
Value | Count | Frequency (%)
` | 11 | 100.0%

Final Punctuation
Value | Count | Frequency (%)
(not recoverable) | 5 | 100.0%

Initial Punctuation
Value | Count | Frequency (%)
(not recoverable) | 1 | 100.0%

Most occurring scripts

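Value | Count | Frequency (%)
Hangul | 81956 | 68.3%
Common | 35582 | 29.7%
Latin | 2326 | 1.9%
Greek | 115 | 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not recoverable) | 7345 | 9.0%
(not recoverable) | 4196 | 5.1%
(not recoverable) | 3240 | 4.0%
(not recoverable) | 3229 | 3.9%
(not recoverable) | 2901 | 3.5%
(not recoverable) | 2313 | 2.8%
(not recoverable) | 2133 | 2.6%
(not recoverable) | 1957 | 2.4%
(not recoverable) | 1787 | 2.2%
(not recoverable) | 1721 | 2.1%
Other values (427) | 51134 | 62.4%

Latin
Value | Count | Frequency (%)
N | 534 | 23.0%
I | 287 | 12.3%
H | 125 | 5.4%
C | 119 | 5.1%
t | 115 | 4.9%
O | 103 | 4.4%
e | 93 | 4.0%
r | 71 | 3.1%
S | 66 | 2.8%
a | 62 | 2.7%
Other values (39) | 751 | 32.3%

Common
Value | Count | Frequency (%)
- | 10029 | 28.2%
, | 9471 | 26.6%
2 | 3007 | 8.5%
1 | 2266 | 6.4%
) | 1705 | 4.8%
( | 1697 | 4.8%
3 | 1549 | 4.4%
4 | 1512 | 4.2%
5 | 712 | 2.0%
[ | 682 | 1.9%
Other values (33) | 2952 | 8.3%

Greek
Value | Count | Frequency (%)
α | 56 | 48.7%
ω | 21 | 18.3%
β | 12 | 10.4%
κ | 9 | 7.8%
μ | 8 | 7.0%
η | 3 | 2.6%
λ | 3 | 2.6%
γ | 1 | 0.9%
ε | 1 | 0.9%
Ο | 1 | 0.9%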

Most occurring blocks

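Value | Count | Frequency (%)
Hangul | 81956 | 68.3%
ASCII | 37850 | 31.5%
None | 129 | 0.1%
Punctuation | 40 | < 0.1%
Number Forms | 4 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 10029 | 26.5%
, | 9471 | 25.0%
2 | 3007 | 7.9%
1 | 2266 | 6.0%
) | 1705 | 4.5%
( | 1697 | 4.5%
3 | 1549 | 4.1%
4 | 1512 | 4.0%
5 | 712 | 1.9%
[ | 682 | 1.8%
Other values (74) | 5220 | 13.8%

Hangul
Value | Count | Frequency (%)
(not recoverable) | 7345 | 9.0%
(not recoverable) | 4196 | 5.1%
(not recoverable) | 3240 | 4.0%
(not recoverable) | 3229 | 3.9%
(not recoverable) | 2901 | 3.5%
(not recoverable) | 2313 | 2.8%
(not recoverable) | 2133 | 2.6%
(not recoverable) | 1957 | 2.4%
(not recoverable) | 1787 | 2.2%
(not recoverable) | 1721 | 2.1%
Other values (427) | 51134 | 62.4%

None
Value | Count | Frequency (%)
α | 56 | 43.4%
ω | 21 | 16.3%
· | 12 | 9.3%
β | 12 | 9.3%
κ | 9 | 7.0%
μ | 8 | 6.2%
η | 3 | 2.3%
λ | 3 | 2.3%
γ | 1 | 0.8%
(not recoverable) | 1 | 0.8%
Other values (3) | 3 | 2.3%

Punctuation
Value | Count | Frequency (%)
(not recoverable) | 34 | 85.0%
(not recoverable) | 5 | 12.5%
(not recoverable) | 1 | 2.5%

Number Forms
Value | Count | Frequency (%)
(not recoverable) | 3 | 75.0%
(not recoverable) | 1 | 25.0%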

인체_일반증상 (human body: general symptoms)
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 59.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row 1
2nd row 1
3rd row 0
4th row 1
5th row 1

Common Values

Value | Count | Frequency (%)
0 | 5658 | 74.9%
1 | 1901 | 25.1%

Figure: Length (histogram of lengths of the category)
Figure: Common Values (plot)

노출_흡입 (exposure: inhalation)
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 59.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row 1
2nd row 1
3rd row 1
4th row 1
5th row 1

Common Values

Value | Count | Frequency (%)
1 | 5725 | 75.7%
0 | 1834 | 24.3%

Figure: Length (histogram of lengths of the category)
Figure: Common Values (plot)

노출_피부 (exposure: skin)
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 59.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row 1
2nd row 1
3rd row 1
4th row 1
5th row 1

Common Values

Value | Count | Frequency (%)
1 | 5661 | 74.9%
0 | 1898 | 25.1%

Figure: Length (histogram of lengths of the category)
Figure: Common Values (plot)

노출_안구 (exposure: eye)
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 59.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row 1
2nd row 1
3rd row 1
4th row 1
5th row 1

Common Values

Value | Count | Frequency (%)
1 | 5732 | 75.8%
0 | 1827 | 24.2%

Figure: Length (histogram of lengths of the category)
Figure: Common Values (plot)

노출_경구 (exposure: oral)
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 59.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row 1
2nd row 1
3rd row 1
4th row 1
5th row 1

Common Values

Value | Count | Frequency (%)
1 | 5511 | 72.9%
0 | 2048 | 27.1%

Figure: Length (histogram of lengths of the category)
Figure: Common Values (plot)

사고노출위험구분수 (number of accident exposure risk categories)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.2451382
Minimum: 0
Maximum: 5
Zeros: 1737
Zeros (%): 23.0%
Negative: 0
Negative (%): 0.0%
Memory size: 66.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 4
Q3: 4
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.876822
Coefficient of variation (CV): 0.57834886
Kurtosis: -0.72126133
Mean: 3.2451382
Median Absolute Deviation (MAD): 1
Skewness: -0.97698158
Sum: 24530
Variance: 3.5224608
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=6)
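Values sorted by frequency
Value | Count | Frequency (%)
4 | 3589 | 47.5%
5 | 1863 | 24.6%
0 | 1737 | 23.0%
3 | 195 | 2.6%
2 | 99 | 1.3%
1 | 76 | 1.0%

Values in ascending order
Value | Count | Frequency (%)
0 | 1737 | 23.0%
1 | 76 | 1.0%
2 | 99 | 1.3%
3 | 195 | 2.6%
4 | 3589 | 47.5%
5 | 1863 | 24.6%

Values in descending order
Value | Count | Frequency (%)
5 | 1863 | 24.6%
4 | 3589 | 47.5%
3 | 195 | 2.6%
2 | 99 | 1.3%
1 | 76 | 1.0%
0 | 1737 | 23.0%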
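
The descriptive statistics for 사고노출위험구분수 can be recomputed from the value counts alone. The sketch below is a consistency check, not the profiler's own code; it assumes the reported variance is the sample variance (dividing by n - 1), which is the choice that reproduces the reported figures.

```python
import math

# Value counts copied from the frequency table of 사고노출위험구분수.
value_counts = {0: 1737, 1: 76, 2: 99, 3: 195, 4: 3589, 5: 1863}

n = sum(value_counts.values())
total = sum(value * count for value, count in value_counts.items())
mean = total / n

# Assumption: sample variance, i.e. the squared deviations are divided by n - 1.
variance = sum(
    count * (value - mean) ** 2 for value, count in value_counts.items()
) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total)
print(round(mean, 7), round(variance, 7))
print(round(std, 6), round(cv, 8))
```

The observation count and the sum match the report exactly, and the mean, variance, standard deviation and coefficient of variation agree with the reported values to the printed precision.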

Unnamed: 9
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 7559
Missing (%): 100.0%
Memory size: 66.6 KiB

Interactions


Correlations

Variable | 인체_일반증상 | 노출_흡입 | 노출_피부 | 노출_안구 | 노출_경구 | 사고노출위험구분수
인체_일반증상 | 1.000 | 0.477 | 0.489 | 0.473 | 0.496 | 1.000
노출_흡입 | 0.477 | 1.000 | 0.997 | 0.997 | 0.990 | 1.000
노출_피부 | 0.489 | 0.997 | 1.000 | 0.997 | 0.991 | 1.000
노출_안구 | 0.473 | 0.997 | 0.997 | 1.000 | 0.987 | 0.999
노출_경구 | 0.496 | 0.990 | 0.991 | 0.987 | 1.000 | 0.999
사고노출위험구분수 | 1.000 | 1.000 | 1.000 | 0.999 | 0.999 | 1.000

Variable | 노출_피부 | 노출_흡입 | 인체_일반증상 | 노출_안구 | 노출_경구
노출_피부 | 1.000 | 0.952 | 0.325 | 0.949 | 0.913
노출_흡입 | 0.952 | 1.000 | 0.317 | 0.952 | 0.911
인체_일반증상 | 0.325 | 0.317 | 1.000 | 0.314 | 0.330
노출_안구 | 0.949 | 0.952 | 0.314 | 1.000 | 0.896
노출_경구 | 0.913 | 0.911 | 0.330 | 0.896 | 1.000

Variable | 사고노출위험구분수 | 인체_일반증상 | 노출_흡입 | 노출_피부 | 노출_안구 | 노출_경구
사고노출위험구분수 | 1.000 | 0.987 | 0.984 | 0.983 | 0.979 | 0.973
인체_일반증상 | 0.987 | 1.000 | 0.317 | 0.325 | 0.314 | 0.330
노출_흡입 | 0.984 | 0.317 | 1.000 | 0.952 | 0.952 | 0.911
노출_피부 | 0.983 | 0.325 | 0.952 | 1.000 | 0.949 | 0.913
노출_안구 | 0.979 | 0.314 | 0.952 | 0.949 | 1.000 | 0.896
노출_경구 | 0.973 | 0.330 | 0.911 | 0.913 | 0.896 | 1.000

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.

Sample

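First 10 rows

Index | 고유(CAS)번호 | 영문 | 국문 | 인체_일반증상 | 노출_흡입 | 노출_피부 | 노출_안구 | 노출_경구 | 사고노출위험구분수 | Unnamed: 9
0 | 75-31-0 | isopropylamine | 아이소프로필아민 | 1 | 1 | 1 | 1 | 1 | 5 | <NA>
1 | 681-84-5 | Tetramethyl,silicate | 테트라메틸실리케이트 | 1 | 1 | 1 | 1 | 1 | 5 | <NA>
2 | 57-92-1 | Streptomycin | 스트렙토마이신 | 0 | 1 | 1 | 1 | 1 | 4 | <NA>
3 | 75-12-7 | Formamide | 폼아마이드 | 1 | 1 | 1 | 1 | 1 | 5 | <NA>
4 | 75-20-7 | Calcium,carbide | 칼슘,카바이드 | 1 | 1 | 1 | 1 | 1 | 5 | <NA>
5 | 75-29-6 | 2-Chloropropane | 2-클로로프로페인 | 0 | 1 | 1 | 1 | 1 | 4 | <NA>
6 | 75-33-2 | Isopropyl,mercaptan | 아이소프로필,머캅탄 | 1 | 1 | 1 | 1 | 1 | 5 | <NA>
7 | 75-36-5 | Acetyl,chloride | 아세틸,클로라이드 | 1 | 1 | 1 | 1 | 1 | 5 | <NA>
8 | 75-64-9 | tert-Butylamine | 3차-뷰틸아민 | 1 | 1 | 1 | 1 | 1 | 5 | <NA>
9 | 75-69-4 | Trichlorofluoromethane | 트라이클로로플루오로메테인 | 1 | 1 | 1 | 1 | 1 | 5 | <NA>

Last 10 rows

Index | 고유(CAS)번호 | 영문 | 국문 | 인체_일반증상 | 노출_흡입 | 노출_피부 | 노출_안구 | 노출_경구 | 사고노출위험구분수 | Unnamed: 9
7549 | 23783-42-8 | Tetraethylene glycol methyl ether | 테트라에틸렌 글리콜 메틸 에테르 | 0 | 0 | 1 | 1 | 0 | 2 | <NA>
7550 | 1559-34-8 | Tetraethylene glycol butyl ether | 테트라에틸렌 글리콜 뷰틸 에테르 | 0 | 0 | 1 | 1 | 0 | 2 | <NA>
7551 | 25498-49-1 | Tripropylene glycol methyl ether | 트리프로필렌 글리콜 메틸 에테르 | 0 | 0 | 0 | 1 | 0 | 1 | <NA>
7552 | 88917-22-0 | Dipropylene glycol methyl ether acetate | 다이프로필렌 글리콜 메틸 에테르 아세테이트 | 0 | 0 | 0 | 0 | 0 | 0 | <NA>
7553 | 92-70-6 | 3-Hydroxy-2-naphthoic acid | 3-하이드록시-2-나프토익산 | 0 | 1 | 1 | 1 | 1 | 4 | <NA>
7554 | 98-59-9 | 4-Methylbenzenesulfonyl chloride | 4-메틸벤젠설포닐 클로라이드 | 0 | 1 | 1 | 1 | 1 | 4 | <NA>
7555 | 463-82-1 | Pentane, all isomers | 펜탄(모든 이성체) | 0 | 1 | 1 | 1 | 0 | 3 | <NA>
7556 | 68553-00-4 | Fuel oil, no. 6 | <NA> | 0 | 1 | 1 | 1 | 0 | 3 | <NA>
7557 | 754-12-1 | 2,3,3,3-Tetrafluoro-1-propene | <NA> | 0 | 1 | 1 | 1 | 0 | 3 | <NA>
7558 | 643-79-8 | Ο-Phthalaldehyde | Ο-프탈알데하이드 | 0 | 1 | 1 | 1 | 1 | 4 | <NA>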
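
In every sample row, 사고노출위험구분수 equals the sum of the five binary flag columns, which would explain the high-correlation alerts. The sketch below checks this on the 20 sample rows and compares the dataset-wide totals taken from the variable summaries. Matching totals are a necessary condition only; confirming the relationship row by row would require the full data, which the report does not include.

```python
# Each tuple: (인체_일반증상, 노출_흡입, 노출_피부, 노출_안구, 노출_경구, 사고노출위험구분수)
sample_rows = [
    (1, 1, 1, 1, 1, 5), (1, 1, 1, 1, 1, 5), (0, 1, 1, 1, 1, 4),
    (1, 1, 1, 1, 1, 5), (1, 1, 1, 1, 1, 5), (0, 1, 1, 1, 1, 4),
    (1, 1, 1, 1, 1, 5), (1, 1, 1, 1, 1, 5), (1, 1, 1, 1, 1, 5),
    (1, 1, 1, 1, 1, 5),
    (0, 0, 1, 1, 0, 2), (0, 0, 1, 1, 0, 2), (0, 0, 0, 1, 0, 1),
    (0, 0, 0, 0, 0, 0), (0, 1, 1, 1, 1, 4), (0, 1, 1, 1, 1, 4),
    (0, 1, 1, 1, 0, 3), (0, 1, 1, 1, 0, 3), (0, 1, 1, 1, 0, 3),
    (0, 1, 1, 1, 1, 4),
]

# Row-level check on the sample: the count column equals the sum of the flags.
rows_consistent = all(sum(row[:5]) == row[5] for row in sample_rows)

# Dataset-level check: number of 1s per flag column (from "Common Values")
# against the reported Sum of 사고노출위험구분수.
ones_per_flag = {
    "인체_일반증상": 1901,
    "노출_흡입": 5725,
    "노출_피부": 5661,
    "노출_안구": 5732,
    "노출_경구": 5511,
}
reported_sum = 24530
totals_consistent = sum(ones_per_flag.values()) == reported_sum

print(rows_consistent, totals_consistent)
```

If the relationship holds for all rows, 사고노출위험구분수 is a derived column, and the flagged correlations reflect that construction rather than an independent pattern in the data.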