Dataset statistics
Number of variables | 6 |
---|---|
Number of observations | 10000 |
Missing cells | 3416 |
Missing cells (%) | 5.7% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 576.2 KiB |
Average record size in memory | 59.0 B |
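The derived figures in this table can be cross-checked from the raw counts. The sketch below is a minimal check, assuming that "Missing cells (%)" is taken over all cells (variables × observations) and that 1 KiB = 1024 B. The per-column missing counts are copied from the 원소 and 데이터 sections of this report, and the variable names are illustrative.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 6
n_observations = 10_000
missing_cells = 3_416
total_size_kib = 576.2

# Per-column missing counts reported later in this report.
missing_by_column = {"원소": 603, "데이터": 2_813}

total_cells = n_variables * n_observations           # 60,000 cells
missing_pct = 100 * missing_cells / total_cells       # share of empty cells
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
print("Per-column counts add up:", sum(missing_by_column.values()) == missing_cells)
```

Only two of the six variables contribute to the missing-cell total, which is why the dataset-level rate is much lower than the 28.1% reported for 데이터 alone.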
Variable types
Text | 2 |
---|---|
Numeric | 2 |
Categorical | 2 |
Dataset
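Description | Material composition information from the Ceramic Materials Information Bank (세라믹소재정보은행) of the Korea Institute of Ceramic Engineering and Technology (한국세라믹기술원). |
---|---|
Author | Korea Institute of Ceramic Engineering and Technology (한국세라믹기술원) |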
URL | https://www.data.go.kr/data/15072095/fileData.do |
Reproduction
Analysis started | 2023-12-12 23:33:10.403011 |
---|---|
Analysis finished | 2023-12-12 23:33:11.731526 |
Duration | 1.33 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
소재시퀀스 (material sequence ID)
Text
Distinct | 4639 |
---|---|
Distinct (%) | 46.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 14 |
---|---|
Median length | 14 |
Mean length | 14 |
Min length | 14 |
Characters and Unicode
Total characters | 140000 |
---|---|
Distinct characters | 14 |
Distinct categories | 3 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 1922 |
---|---|
Unique (%) | 19.2% |
Sample
1st row | MAT-1000008848 |
---|---|
2nd row | MAT-1000008583 |
3rd row | MAT-1000006056 |
4th row | MAT-1000002115 |
5th row | MAT-1000002186 |
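The sample rows and the length statistics (minimum, median and maximum length all 14) suggest a fixed ID layout: the prefix `MAT-` followed by ten digits. The source does not document the format, so the pattern below is an assumption inferred from this report, and `ID_PATTERN` and `is_valid_id` are hypothetical names. A minimal sketch of a format check:

```python
import re

# Assumed layout, inferred from the sample rows: "MAT-" plus ten digits.
ID_PATTERN = re.compile(r"MAT-\d{10}")

# The five sample rows shown in the report.
sample_ids = [
    "MAT-1000008848",
    "MAT-1000008583",
    "MAT-1000006056",
    "MAT-1000002115",
    "MAT-1000002186",
]


def is_valid_id(value: str) -> bool:
    """Return True if the whole string matches the assumed ID layout."""
    return ID_PATTERN.fullmatch(value) is not None


all_valid = all(is_valid_id(s) for s in sample_ids)
lengths = {len(s) for s in sample_ids}

print("All samples match:", all_valid)
print("Observed lengths:", lengths)
```

The check is case-sensitive, so a lowercased value such as `mat-1000008848` would not pass.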
Value | Count | Frequency (%) |
mat-1000002834 | 12 | 0.1% |
mat-1000002846 | 11 | 0.1% |
mat-1000006912 | 10 | 0.1% |
mat-1000007806 | 10 | 0.1% |
mat-1000002880 | 10 | 0.1% |
mat-1000007813 | 10 | 0.1% |
mat-1000002863 | 9 | 0.1% |
mat-1000003471 | 9 | 0.1% |
mat-1000002874 | 9 | 0.1% |
mat-1000002843 | 9 | 0.1% |
Other values (4629) | 9901 | 99.0% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 53298 | 38.1% |
1 | 12918 | 9.2% |
M | 10000 | 7.1% |
A | 10000 | 7.1% |
T | 10000 | 7.1% |
- | 10000 | 7.1% |
8 | 4724 | 3.4% |
2 | 4669 | 3.3% |
7 | 4634 | 3.3% |
6 | 4192 | 3.0% |
Other values (4) | 15565 | 11.1% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 100000 | 71.4% |
Uppercase Letter | 30000 | 21.4% |
Dash Punctuation | 10000 | 7.1% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 53298 | 53.3% |
1 | 12918 | 12.9% |
8 | 4724 | 4.7% |
2 | 4669 | 4.7% |
7 | 4634 | 4.6% |
6 | 4192 | 4.2% |
3 | 4058 | 4.1% |
4 | 3984 | 4.0% |
9 | 3975 | 4.0% |
5 | 3548 | 3.5% |
Uppercase Letter
Value | Count | Frequency (%) |
M | 10000 | 33.3% |
A | 10000 | 33.3% |
T | 10000 | 33.3% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 10000 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 110000 | 78.6% |
Latin | 30000 | 21.4% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 53298 | 48.5% |
1 | 12918 | 11.7% |
- | 10000 | 9.1% |
8 | 4724 | 4.3% |
2 | 4669 | 4.2% |
7 | 4634 | 4.2% |
6 | 4192 | 3.8% |
3 | 4058 | 3.7% |
4 | 3984 | 3.6% |
9 | 3975 | 3.6% |
Latin
Value | Count | Frequency (%) |
M | 10000 | 33.3% |
A | 10000 | 33.3% |
T | 10000 | 33.3% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 140000 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 53298 | 38.1% |
1 | 12918 | 9.2% |
M | 10000 | 7.1% |
A | 10000 | 7.1% |
T | 10000 | 7.1% |
- | 10000 | 7.1% |
8 | 4724 | 3.4% |
2 | 4669 | 3.3% |
7 | 4634 | 3.3% |
6 | 4192 | 3.0% |
Other values (4) | 15565 | 11.1% |
순번 (sequence number)
Real number (ℝ)
Distinct | 50 |
---|---|
Distinct (%) | 0.5% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 5.4351 |
Minimum | 1 |
---|---|
Maximum | 51 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 2 |
median | 4 |
Q3 | 7 |
95-th percentile | 13 |
Maximum | 51 |
Range | 50 |
Interquartile range (IQR) | 5 |
Descriptive statistics
Standard deviation | 5.1692998 |
---|---|
Coefficient of variation (CV) | 0.95109561 |
Kurtosis | 21.77003 |
Mean | 5.4351 |
Median Absolute Deviation (MAD) | 3 |
Skewness | 3.7215188 |
Sum | 54351 |
Variance | 26.72166 |
Monotonicity | Not monotonic |
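Several of the descriptive statistics for 순번 follow directly from others in the same tables. As a consistency check, the sketch below recomputes the mean, coefficient of variation, variance, interquartile range and range from the reported values. It assumes the usual definitions (CV = standard deviation / mean, variance = standard deviation squared); the variable names are illustrative.

```python
# Values copied from the 순번 tables.
n = 10_000
total = 54_351            # Sum
std = 5.1692998           # Standard deviation
q1, q3 = 2, 7             # Q1 and Q3
minimum, maximum = 1, 51

mean = total / n                   # reported as 5.4351
cv = std / mean                    # reported as 0.95109561
variance = std ** 2                # reported as 26.72166
iqr = q3 - q1                      # reported as 5
value_range = maximum - minimum    # reported as 50

print(f"mean={mean:.4f} cv={cv:.6f} variance={variance:.5f}")
print(f"iqr={iqr} range={value_range}")
```

The mean (5.4351) sits above the median (4), which is consistent with the positive skewness reported for this variable.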
Common Values
Value | Count | Frequency (%) |
1 | 2316 | 23.2% |
4 | 1838 | 18.4% |
5 | 1233 | 12.3% |
7 | 788 | 7.9% |
3 | 782 | 7.8% |
6 | 737 | 7.4% |
9 | 474 | 4.7% |
8 | 464 | 4.6% |
10 | 245 | 2.5% |
2 | 212 | 2.1% |
Other values (40) | 911 | 9.1% |
Minimum 10 values
Value | Count | Frequency (%) |
1 | 2316 | 23.2% |
2 | 212 | 2.1% |
3 | 782 | 7.8% |
4 | 1838 | 18.4% |
5 | 1233 | 12.3% |
6 | 737 | 7.4% |
7 | 788 | 7.9% |
8 | 464 | 4.6% |
9 | 474 | 4.7% |
10 | 245 | 2.5% |
Maximum 10 values
Value | Count | Frequency (%) |
51 | 1 | < 0.1% |
50 | 2 | < 0.1% |
49 | 3 | < 0.1% |
48 | 4 | < 0.1% |
47 | 6 | 0.1% |
46 | 6 | 0.1% |
45 | 5 | 0.1% |
44 | 4 | < 0.1% |
43 | 5 | 0.1% |
42 | 4 | < 0.1% |
원소 (element)
Text
Alerts: MISSING
Distinct | 564 |
---|---|
Distinct (%) | 6.0% |
Missing | 603 |
Missing (%) | 6.0% |
Memory size | 156.2 KiB |
Value | Count | Frequency (%) |
tio2 | 928 | 8.6% |
pbo | 499 | 4.6% |
nb2o5 | 490 | 4.5% |
zro2 | 382 | 3.5% |
baco3 | 320 | 3.0% |
bi2o3 | 320 | 3.0% |
cuo | 290 | 2.7% |
zno | 264 | 2.5% |
na2co3 | 259 | 2.4% |
mgo | 225 | 2.1% |
Other values (561) | 6793 | 63.1% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 8475 | 13.5% |
O | 8360 | 13.3% |
2 | 5767 | 9.2% |
3 | 4332 | 6.9% |
C | 4070 | 6.5% |
i | 2838 | 4.5% |
1 | 2416 | 3.8% |
a | 1893 | 3.0% |
N | 1456 | 2.3% |
T | 1431 | 2.3% |
Other values (70) | 21971 | 34.9% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 23794 | 37.8% |
Uppercase Letter | 22340 | 35.5% |
Lowercase Letter | 12521 | 19.9% |
Other Punctuation | 1877 | 3.0% |
Space Separator | 1427 | 2.3% |
Close Punctuation | 301 | 0.5% |
Open Punctuation | 301 | 0.5% |
Dash Punctuation | 243 | 0.4% |
Math Symbol | 195 | 0.3% |
Other Letter | 8 | < 0.1% |
Other values (2) | 2 | < 0.1% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
i | 2838 | 22.7% |
a | 1893 | 15.1% |
b | 1403 | 11.2% |
n | 1046 | 8.4% |
r | 1023 | 8.2% |
l | 676 | 5.4% |
e | 658 | 5.3% |
u | 555 | 4.4% |
g | 474 | 3.8% |
o | 366 | 2.9% |
Other values (17) | 1589 | 12.7% |
Uppercase Letter
Value | Count | Frequency (%) |
O | 8360 | 37.4% |
C | 4070 | 18.2% |
N | 1456 | 6.5% |
T | 1431 | 6.4% |
B | 1139 | 5.1% |
Z | 874 | 3.9% |
S | 846 | 3.8% |
P | 735 | 3.3% |
M | 724 | 3.2% |
L | 609 | 2.7% |
Other values (14) | 2096 | 9.4% |
Decimal Number
Value | Count | Frequency (%) |
0 | 8475 | 35.6% |
2 | 5767 | 24.2% |
3 | 4332 | 18.2% |
1 | 2416 | 10.2% |
5 | 1406 | 5.9% |
4 | 750 | 3.2% |
9 | 209 | 0.9% |
6 | 168 | 0.7% |
8 | 143 | 0.6% |
7 | 128 | 0.5% |
Other Punctuation
Value | Count | Frequency (%) |
, | 1331 | 70.9% |
. | 344 | 18.3% |
/ | 146 | 7.8% |
% | 20 | 1.1% |
· | 19 | 1.0% |
: | 10 | 0.5% |
" | 4 | 0.2% |
* | 3 | 0.2% |
Close Punctuation
Value | Count | Frequency (%) |
) | 292 | 97.0% |
] | 9 | 3.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 292 | 97.0% |
[ | 9 | 3.0% |
Math Symbol
Value | Count | Frequency (%) |
+ | 152 | 77.9% |
~ | 43 | 22.1% |
Space Separator
Value | Count | Frequency (%) |
(space) | 1427 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 243 | 100.0% |
Other Letter
Value | Count | Frequency (%) |
ㆍ | 8 | 100.0% |
Other Number
Value | Count | Frequency (%) |
₂ | 1 | 100.0% |
Control
Value | Count | Frequency (%) |
(control character) | 1 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 34836 | 55.3% |
Common | 28140 | 44.7% |
Greek | 25 | < 0.1% |
Hangul | 8 | < 0.1% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
O | 8360 | 24.0% |
C | 4070 | 11.7% |
i | 2838 | 8.1% |
a | 1893 | 5.4% |
N | 1456 | 4.2% |
T | 1431 | 4.1% |
b | 1403 | 4.0% |
B | 1139 | 3.3% |
n | 1046 | 3.0% |
r | 1023 | 2.9% |
Other values (39) | 10177 | 29.2% |
Common
Value | Count | Frequency (%) |
0 | 8475 | 30.1% |
2 | 5767 | 20.5% |
3 | 4332 | 15.4% |
1 | 2416 | 8.6% |
(space) | 1427 | 5.1% |
5 | 1406 | 5.0% |
, | 1331 | 4.7% |
4 | 750 | 2.7% |
. | 344 | 1.2% |
) | 292 | 1.0% |
Other values (18) | 1600 | 5.7% |
Greek
Value | Count | Frequency (%) |
α | 22 | 88.0% |
δ | 3 | 12.0% |
Hangul
Value | Count | Frequency (%) |
ㆍ | 8 | 100.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 62956 | 99.9% |
None | 45 | 0.1% |
Compat Jamo | 8 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 8475 | 13.5% |
O | 8360 | 13.3% |
2 | 5767 | 9.2% |
3 | 4332 | 6.9% |
C | 4070 | 6.5% |
i | 2838 | 4.5% |
1 | 2416 | 3.8% |
a | 1893 | 3.0% |
N | 1456 | 2.3% |
T | 1431 | 2.3% |
Other values (65) | 21918 | 34.8% |
None
Value | Count | Frequency (%) |
α | 22 | 48.9% |
· | 19 | 42.2% |
δ | 3 | 6.7% |
₂ | 1 | 2.2% |
Compat Jamo
Value | Count | Frequency (%) |
ㆍ | 8 | 100.0% |
데이터 (data value)
Real number (ℝ)
Alerts: MISSING, SKEWED, ZEROS
Distinct | 771 |
---|---|
Distinct (%) | 10.7% |
Missing | 2813 |
Missing (%) | 28.1% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 589167.15 |
Minimum | 0 |
---|---|
Maximum | 4.2342342 × 10^9 |
Zeros | 624 |
Zeros (%) | 6.2% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 0.3 |
median | 1 |
Q3 | 10 |
95-th percentile | 96 |
Maximum | 4.2342342 × 10^9 |
Range | 4.2342342 × 10^9 |
Interquartile range (IQR) | 9.7 |
Descriptive statistics
Standard deviation | 49946039 |
---|---|
Coefficient of variation (CV) | 84.773971 |
Kurtosis | 7187 |
Mean | 589167.15 |
Median Absolute Deviation (MAD) | 1 |
Skewness | 84.776176 |
Sum | 4.2343443 × 10^9 |
Variance | 2.4946068 × 10^15 |
Monotonicity | Not monotonic |
Common Values
Value | Count | Frequency (%) |
1.0 | 1070 | 10.7% |
0.0 | 624 | 6.2% |
2.0 | 293 | 2.9% |
50.0 | 287 | 2.9% |
100.0 | 262 | 2.6% |
0.5 | 243 | 2.4% |
5.0 | 213 | 2.1% |
10.0 | 210 | 2.1% |
0.2 | 202 | 2.0% |
0.1 | 158 | 1.6% |
Other values (761) | 3625 | 36.3% |
(Missing) | 2813 | 28.1% |
Minimum 10 values
Value | Count | Frequency (%) |
0.0 | 624 | 6.2% |
3.5e-06 | 1 | < 0.1% |
0.0005 | 1 | < 0.1% |
0.001 | 3 | < 0.1% |
0.0018 | 1 | < 0.1% |
0.002 | 3 | < 0.1% |
0.0025 | 6 | 0.1% |
0.004 | 5 | 0.1% |
0.005 | 29 | 0.3% |
0.0055 | 1 | < 0.1% |
Maximum 10 values
Value | Count | Frequency (%) |
4234234234.0 | 1 | < 0.1% |
500.0 | 2 | < 0.1% |
400.0 | 1 | < 0.1% |
200.0 | 6 | 0.1% |
150.0 | 1 | < 0.1% |
100.0 | 262 | 2.6% |
99.99 | 2 | < 0.1% |
99.95 | 2 | < 0.1% |
99.91 | 3 | < 0.1% |
99.9 | 6 | 0.1% |
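The largest 데이터 value, 4234234234.0, occurs once and is far above the next largest value (500.0), while the median is 1 and the 95th percentile is 96. That one record therefore plausibly accounts for most of the reported mean. The sketch below estimates the mean with that record left out, using only figures from this report. Because the reported mean is rounded to two decimals, the result is approximate, and whether the value is an entry error is not stated in the source.

```python
# Figures copied from the 데이터 tables.
n_total = 10_000
n_missing = 2_813
reported_mean = 589_167.15
largest_value = 4_234_234_234.0   # appears exactly once

n_valid = n_total - n_missing                 # non-missing observations
approx_sum = reported_mean * n_valid          # close to the reported Sum
sum_without_largest = approx_sum - largest_value
mean_without_largest = sum_without_largest / (n_valid - 1)

print(f"Non-missing observations: {n_valid}")
print(f"Approximate sum: {approx_sum:.4e}")
print(f"Approximate mean without the largest value: {mean_without_largest:.1f}")
```

Under these assumptions the mean drops from roughly 589,000 to the order of 15, much closer to the quartiles (Q1 = 0.3, Q3 = 10). Robust summaries such as the median and IQR are the safer description of this column.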
단위 (unit)
Categorical
Distinct | 6 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 6 |
---|---|
Median length | 6 |
Mean length | 5.6322 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | F01002 |
4th row | F01001 |
5th row | F01001 |
Common Values
Value | Count | Frequency (%) |
F01002 | 3113 | 31.1% |
F01001 | 2605 | 26.1% |
F01003 | 2297 | 23.0% |
<NA> | 1839 | 18.4% |
F01004 | 74 | 0.7% |
F01006 | 72 | 0.7% |
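The report lists 0 missing values for 단위 even though `<NA>` appears 1839 times, which suggests the profiler treated `<NA>` as a literal four-character category and not as a missing marker. That reading is an inference, not something the source states. The reported length statistics are consistent with it, as this minimal sketch shows (the dictionary simply restates the Common Values table):

```python
# Category counts copied from the Common Values table for 단위.
counts = {
    "F01002": 3113,
    "F01001": 2605,
    "F01003": 2297,
    "<NA>": 1839,
    "F01004": 74,
    "F01006": 72,
}

total = sum(counts.values())

# Mean string length, weighting each category by its count.
mean_length = sum(len(value) * n for value, n in counts.items()) / total
min_length = min(len(value) for value in counts)
max_length = max(len(value) for value in counts)

print(f"total={total} mean_length={mean_length:.4f}")
print(f"min_length={min_length} max_length={max_length}")
```

The computed mean length of 5.6322 and minimum length of 4 match the Length table only if the `<NA>` entries are counted as strings, so it may be worth converting them to true missing values before further analysis.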
조성구분 (composition category)
Categorical
Distinct | 3 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 4 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 1000 |
---|---|
2nd row | 1000 |
3rd row | 3000 |
4th row | 3000 |
5th row | 3000 |
Common Values
Value | Count | Frequency (%) |
1000 | 6822 | 68.2% |
3000 | 1682 | 16.8% |
7000 | 1496 | 15.0% |
Variable | 순번 | 데이터 | 단위 | 조성구분 |
---|---|---|---|---|
순번 | 1.000 | 0.000 | 0.304 | 0.389 |
데이터 | 0.000 | 1.000 | 0.000 | 0.000 |
단위 | 0.304 | 0.000 | 1.000 | 0.457 |
조성구분 | 0.389 | 0.000 | 0.457 | 1.000 |
Variable | 조성구분 | 단위 |
---|---|---|
조성구분 | 1.000 | 0.389 |
단위 | 0.389 | 1.000 |
Variable | 순번 | 데이터 | 단위 | 조성구분 |
---|---|---|---|---|
순번 | 1.000 | -0.211 | 0.133 | 0.264 |
데이터 | -0.211 | 1.000 | 0.000 | 0.000 |
단위 | 0.133 | 0.000 | 1.000 | 0.389 |
조성구분 | 0.264 | 0.000 | 0.389 | 1.000 |
Index | 소재시퀀스 | 순번 | 원소 | 데이터 | 단위 | 조성구분 |
---|---|---|---|---|---|---|
10759 | MAT-1000008848 | 6 | K2CO3 | <NA> | <NA> | 1000 |
15003 | MAT-1000008583 | 5 | YAG | 10.0 | <NA> | 1000 |
12865 | MAT-1000006056 | 4 | CdO | 1.2 | F01002 | 3000 |
1146 | MAT-1000002115 | 3 | YO3/2 | 2.0 | F01001 | 3000 |
1097 | MAT-1000002186 | 9 | CuO | 0.1 | F01001 | 3000 |
15940 | MAT-1000006000 | 9 | NiO, WO3, ZrO2, TiO2 | 50.0 | F01001 | 1000 |
6565 | MAT-1000003106 | 1 | B4C | 98.0 | F01002 | 1000 |
13177 | MAT-1000008173 | 7 | Ta2O5 | <NA> | <NA> | 1000 |
2756 | MAT-1000002441 | 4 | Nb2O5 | 1.0 | F01003 | 1000 |
15797 | MAT-1000006920 | 13 | PbO | 1.0 | F01002 | 3000 |
Index | 소재시퀀스 | 순번 | 원소 | 데이터 | 단위 | 조성구분 |
---|---|---|---|---|---|---|
3059 | MAT-1000002626 | 4 | ZrO2 | 0.0 | F01001 | 1000 |
16895 | MAT-1000008464 | 7 | SnCl4.5H2O | 0.005 | F01001 | 1000 |
7777 | MAT-1000003836 | 10 | F127 ((EO)106(PO)70(EO)106) | 0.02 | <NA> | 1000 |
15912 | MAT-1000005936 | 7 | NiO, WO3, ZrO2, TiO2 | 50.0 | F01001 | 1000 |
6772 | MAT-1000003189 | 3 | Eu2+ | 2.0 | F01001 | 3000 |
16354 | MAT-1000007327 | 4 | ZrO2 | <NA> | F01003 | 1000 |
10222 | MAT-1000004508 | 4 | Nb2O5 | 1.0 | F01003 | 1000 |
1702 | MAT-1000002273 | 9 | CuO | 0.1 | F01001 | 3000 |
4000 | MAT-1000002768 | 8 | C010100008 | 3.77 | F01002 | 7000 |
5538 | MAT-1000003016 | 4 | Y2O3 | 5.0 | F01002 | 1000 |