Overview

Dataset statistics

Number of variables: 11
Number of observations: 22
Missing cells: 47
Missing cells (%): 19.4%
Duplicate rows: 1
Duplicate rows (%): 4.5%
Total size in memory: 2.0 KiB
Average record size in memory: 94.0 B
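
The percentages above follow from the raw counts. As a quick sanity check, here is a minimal sketch of that arithmetic; the variable names are illustrative, and the per-column missing counts are taken from the Alerts section of this report.

```python
# Counts copied from the "Dataset statistics" table.
n_variables = 11
n_observations = 22
missing_cells = 47
duplicate_rows = 1

# Per-column missing counts from the alerts: one column with 2, five with 9.
per_column_missing = [2] + [9] * 5

total_cells = n_variables * n_observations             # every cell in the table
missing_pct = 100 * missing_cells / total_cells         # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations   # share of duplicate rows

print(f"Total cells: {total_cells}")
print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Sum of per-column missing counts: {sum(per_column_missing)}")
```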

Variable types

Text: 1
Unsupported: 5
Categorical: 5

Dataset

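Description: Monitoring results for domestic and imported cement, analyzed by the National Institute of Environmental Research (국립환경과학원). The data cover heavy metals (hexavalent chromium, arsenic, lead, copper, cadmium, mercury) and radioactivity (cesium-134, cesium-137, iodine-131) in domestic and foreign cement products. Heavy-metal results are reported in mg/kg and radioactivity results in Bq/g.
Author: Ministry of Environment, National Institute of Environmental Research (환경부 국립환경과학원)
URL: https://www.data.go.kr/data/15048247/fileData.do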

Alerts

Dataset has 1 (4.5%) duplicate rows [Duplicates]
Unnamed: 7 is highly overall correlated with Unnamed: 10 [High correlation]
Unnamed: 3 is highly overall correlated with Unnamed: 10 [High correlation]
Unnamed: 9 is highly overall correlated with Unnamed: 10 [High correlation]
Unnamed: 8 is highly overall correlated with Unnamed: 10 [High correlation]
Unnamed: 10 is highly overall correlated with Unnamed: 3 and 3 other fields [High correlation]
국내ㆍ외 시멘트 중금속 분석결과 has 2 (9.1%) missing values [Missing]
Unnamed: 1 has 9 (40.9%) missing values [Missing]
Unnamed: 2 has 9 (40.9%) missing values [Missing]
Unnamed: 4 has 9 (40.9%) missing values [Missing]
Unnamed: 5 has 9 (40.9%) missing values [Missing]
Unnamed: 6 has 9 (40.9%) missing values [Missing]
Unnamed: 1 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 2 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 4 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 5 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 6 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
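
The "Unsupported" alerts are consistent with columns that mix numbers with text, such as header labels and the marker "불검출" that the dataset's own footnote uses for values below the quantification limit. The following is a minimal cleaning sketch, not the method used by the report: `parse_measurement` is a hypothetical helper, the sample values are illustrative, and mapping non-detects to `None` is only one possible choice.

```python
from typing import Optional

NOT_DETECTED = "불검출"  # dataset's label for values below the quantification limit

def parse_measurement(raw: object) -> Optional[float]:
    """Return the value as a float, or None for non-detects, blanks, and labels."""
    if raw is None:
        return None
    text = str(raw).strip()
    if not text or text == NOT_DETECTED:
        return None
    try:
        value = float(text)
    except ValueError:
        return None  # non-numeric text such as a column label or footnote
    # float("nan") parses without error, so filter it out explicitly
    return None if value != value else value

# Illustrative mixed column: a label, numbers, a non-detect, NaN, a numeric string.
column = ["Cr6+", 5.44, 8.25, "불검출", float("nan"), "17.54"]
cleaned = [parse_measurement(v) for v in column]
print(cleaned)
```

Keeping non-detects as `None` rather than 0 avoids understating concentrations; substituting the quantification limit or half of it would be an alternative if those limits were known.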

Reproduction

Analysis started: 2023-12-16 15:27:40.162164
Analysis finished: 2023-12-16 15:27:43.365277
Duration: 3.2 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

국내ㆍ외 시멘트 중금속 분석결과
Text

Distinct: 20
Distinct (%): 100.0%
Missing: 2
Missing (%): 9.1%
Memory size: 308.0 B

Length

Max length: 86
Median length: 63
Mean length: 18.1
Min length: 2

Characters and Unicode

Total characters: 362
Distinct characters: 149
Distinct categories: 13
Distinct scripts: 3
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
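
As an illustration of how such character properties can be read, the sketch below counts Unicode general categories for one of the sample rows using Python's standard `unicodedata` module. It is not necessarily how the report computes its figures; category codes such as "Lo" (Other Letter) and "Zs" (Space Separator) correspond to the category names used in the tables that follow.

```python
import unicodedata
from collections import Counter

# One of the sample rows shown in this report.
text = "□ 최종 분석결과(정량한계 반영)"

# Two-letter general category of every code point, e.g. "Lo" = Other Letter.
categories = Counter(unicodedata.category(ch) for ch in text)

for code, count in categories.most_common():
    print(code, count)
```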

Unique

Unique: 20
Unique (%): 100.0%

Sample

1st row: 대상년월 : 2022년 1월
2nd row: □ 최종 분석결과(정량한계 반영)
3rd row: 구분
4th row: 현대(영월)
5th row: 현대(단양)

Value | Count | Frequency (%)
(not rendered) | 6 | 8.6%
(not rendered) | 3 | 4.3%
20 | 2 | 2.9%
미만 | 2 | 2.9%
성신(단양 | 1 | 1.4%
(not rendered) | 1 | 1.4%
방사능(134cs | 1 | 1.4%
137cs | 1 | 1.4%
131i)은 | 1 | 1.4%
한국인정기구 | 1 | 1.4%
Other values (51) | 51 | 72.9%

Most occurring characters

Value | Count | Frequency (%)
(space) | 56 | 15.5%
( | 17 | 4.7%
) | 17 | 4.7%
m | 6 | 1.7%
2 | 6 | 1.7%
(not rendered) | 5 | 1.4%
i | 5 | 1.4%
(not rendered) | 5 | 1.4%
t | 5 | 1.4%
1 | 5 | 1.4%
Other values (139) | 235 | 64.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 177 | 48.9%
Space Separator | 56 | 15.5%
Lowercase Letter | 44 | 12.2%
Decimal Number | 22 | 6.1%
Open Punctuation | 17 | 4.7%
Close Punctuation | 17 | 4.7%
Uppercase Letter | 12 | 3.3%
Other Punctuation | 11 | 3.0%
Dash Punctuation | 2 | 0.6%
Other Symbol | 1 | 0.3%
Other values (3) | 3 | 0.8%

Most frequent character per category

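Other Letter
Value | Count | Frequency (%)
(not rendered) | 5 | 2.8%
(not rendered) | 5 | 2.8%
(not rendered) | 5 | 2.8%
(not rendered) | 4 | 2.3%
(not rendered) | 4 | 2.3%
(not rendered) | 4 | 2.3%
(not rendered) | 3 | 1.7%
(not rendered) | 3 | 1.7%
(not rendered) | 3 | 1.7%
(not rendered) | 3 | 1.7%
Other values (96) | 138 | 78.0%

Lowercase Letter
Value | Count | Frequency (%)
m | 6 | 13.6%
i | 5 | 11.4%
t | 5 | 11.4%
g | 4 | 9.1%
s | 3 | 6.8%
a | 3 | 6.8%
e | 3 | 6.8%
k | 3 | 6.8%
u | 2 | 4.5%
o | 2 | 4.5%
Other values (7) | 8 | 18.2%

Decimal Number
Value | Count | Frequency (%)
2 | 6 | 27.3%
1 | 5 | 22.7%
0 | 5 | 22.7%
3 | 3 | 13.6%
7 | 1 | 4.5%
4 | 1 | 4.5%
9 | 1 | 4.5%

Uppercase Letter
Value | Count | Frequency (%)
C | 3 | 25.0%
M | 2 | 16.7%
A | 2 | 16.7%
D | 2 | 16.7%
I | 1 | 8.3%
S | 1 | 8.3%
O | 1 | 8.3%

Other Punctuation
Value | Count | Frequency (%)
: | 4 | 36.4%
(not rendered) | 3 | 27.3%
, | 2 | 18.2%
/ | 2 | 18.2%

Space Separator
Value | Count | Frequency (%)
(space) | 56 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 17 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 17 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(not rendered) | 1 | 100.0%

Final Punctuation
Value | Count | Frequency (%)
(not rendered) | 1 | 100.0%

Initial Punctuation
Value | Count | Frequency (%)
(not rendered) | 1 | 100.0%

Letter Number
Value | Count | Frequency (%)
(not rendered) | 1 | 100.0%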

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 177 | 48.9%
Common | 128 | 35.4%
Latin | 57 | 15.7%

Most frequent character per script

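Hangul
Value | Count | Frequency (%)
(not rendered) | 5 | 2.8%
(not rendered) | 5 | 2.8%
(not rendered) | 5 | 2.8%
(not rendered) | 4 | 2.3%
(not rendered) | 4 | 2.3%
(not rendered) | 4 | 2.3%
(not rendered) | 3 | 1.7%
(not rendered) | 3 | 1.7%
(not rendered) | 3 | 1.7%
(not rendered) | 3 | 1.7%
Other values (96) | 138 | 78.0%

Latin
Value | Count | Frequency (%)
m | 6 | 10.5%
i | 5 | 8.8%
t | 5 | 8.8%
g | 4 | 7.0%
C | 3 | 5.3%
s | 3 | 5.3%
a | 3 | 5.3%
e | 3 | 5.3%
k | 3 | 5.3%
u | 2 | 3.5%
Other values (15) | 20 | 35.1%

Common
Value | Count | Frequency (%)
(space) | 56 | 43.8%
( | 17 | 13.3%
) | 17 | 13.3%
2 | 6 | 4.7%
1 | 5 | 3.9%
0 | 5 | 3.9%
: | 4 | 3.1%
(not rendered) | 3 | 2.3%
3 | 3 | 2.3%
, | 2 | 1.6%
Other values (8) | 10 | 7.8%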

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 178 | 49.2%
Hangul | 177 | 48.9%
Punctuation | 5 | 1.4%
Geometric Shapes | 1 | 0.3%
Number Forms | 1 | 0.3%

Most frequent character per block

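ASCII
Value | Count | Frequency (%)
(space) | 56 | 31.5%
( | 17 | 9.6%
) | 17 | 9.6%
m | 6 | 3.4%
2 | 6 | 3.4%
i | 5 | 2.8%
t | 5 | 2.8%
1 | 5 | 2.8%
0 | 5 | 2.8%
: | 4 | 2.2%
Other values (28) | 52 | 29.2%

Hangul
Value | Count | Frequency (%)
(not rendered) | 5 | 2.8%
(not rendered) | 5 | 2.8%
(not rendered) | 5 | 2.8%
(not rendered) | 4 | 2.3%
(not rendered) | 4 | 2.3%
(not rendered) | 4 | 2.3%
(not rendered) | 3 | 1.7%
(not rendered) | 3 | 1.7%
(not rendered) | 3 | 1.7%
(not rendered) | 3 | 1.7%
Other values (96) | 138 | 78.0%

Punctuation
Value | Count | Frequency (%)
(not rendered) | 3 | 60.0%
(not rendered) | 1 | 20.0%
(not rendered) | 1 | 20.0%

Geometric Shapes
Value | Count | Frequency (%)
(not rendered) | 1 | 100.0%

Number Forms
Value | Count | Frequency (%)
(not rendered) | 1 | 100.0%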

Unnamed: 1
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 9
Missing (%): 40.9%
Memory size: 308.0 B

Unnamed: 2
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 9
Missing (%): 40.9%
Memory size: 308.0 B

Unnamed: 3
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 13.6%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Category frequency plot: 불검출 (12), <NA>, Cd (1)

Length

Max length: 4
Median length: 3
Mean length: 3.3636364
Min length: 2

Unique

Unique: 1
Unique (%): 4.5%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: Cd
5th row: 불검출

Common Values

Value | Count | Frequency (%)
불검출 | 12 | 54.5%
<NA> | 9 | 40.9%
Cd | 1 | 4.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
불검출 | 12 | 54.5%
na | 9 | 40.9%
cd | 1 | 4.5%

Unnamed: 4
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 9
Missing (%): 40.9%
Memory size: 308.0 B

Unnamed: 5
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 9
Missing (%): 40.9%
Memory size: 308.0 B

Unnamed: 6
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 9
Missing (%): 40.9%
Memory size: 308.0 B

Unnamed: 7
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 13.6%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Category frequency plot: 결정준위 미만 (12), <NA>, 134Cs (1)

Length

Max length: 7
Median length: 7
Mean length: 5.6818182
Min length: 4

Unique

Unique: 1
Unique (%): 4.5%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: 134Cs
5th row: 결정준위 미만

Common Values

Value | Count | Frequency (%)
결정준위 미만 | 12 | 54.5%
<NA> | 9 | 40.9%
134Cs | 1 | 4.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
결정준위 | 12 | 35.3%
미만 | 12 | 35.3%
na | 9 | 26.5%
134cs | 1 | 2.9%

Unnamed: 8
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 13.6%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Category frequency plot: 결정준위 미만 (12), <NA>, 137Cs (1)

Length

Max length: 7
Median length: 7
Mean length: 5.6818182
Min length: 4

Unique

Unique: 1
Unique (%): 4.5%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: 137Cs
5th row: 결정준위 미만

Common Values

Value | Count | Frequency (%)
결정준위 미만 | 12 | 54.5%
<NA> | 9 | 40.9%
137Cs | 1 | 4.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
결정준위 | 12 | 35.3%
미만 | 12 | 35.3%
na | 9 | 26.5%
137cs | 1 | 2.9%

Unnamed: 9
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 13.6%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Category frequency plot: 결정준위 미만 (12), <NA>, 131I (1)

Length

Max length: 7
Median length: 7
Mean length: 5.6363636
Min length: 4

Unique

Unique: 1
Unique (%): 4.5%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: 131I
5th row: 결정준위 미만

Common Values

Value | Count | Frequency (%)
결정준위 미만 | 12 | 54.5%
<NA> | 9 | 40.9%
131I | 1 | 4.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
결정준위 | 12 | 35.3%
미만 | 12 | 35.3%
na | 9 | 26.5%
131i | 1 | 2.9%

Unnamed: 10
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 22.7%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Category frequency plot: 국내산 (11), <NA>, (단위 : mg/kg, Bq/g) (1), 비고 (1), 수입산 (1)

Length

Max length: 18
Median length: 3
Mean length: 4
Min length: 2

Unique

Unique: 3
Unique (%): 13.6%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: (단위 : mg/kg, Bq/g)
4th row: 비고
5th row: 국내산

Common Values

Value | Count | Frequency (%)
국내산 | 11 | 50.0%
<NA> | 8 | 36.4%
(단위 : mg/kg, Bq/g) | 1 | 4.5%
비고 | 1 | 4.5%
수입산 | 1 | 4.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
국내산 | 11 | 44.0%
na | 8 | 32.0%
단위 | 1 | 4.0%
(not rendered) | 1 | 4.0%
mg/kg | 1 | 4.0%
bq/g | 1 | 4.0%
비고 | 1 | 4.0%
수입산 | 1 | 4.0%

Correlations

Variable | 국내ㆍ외 시멘트 중금속 분석결과 | Unnamed: 3 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10
국내ㆍ외 시멘트 중금속 분석결과 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 3 | 1.000 | 1.000 | 0.562 | 0.562 | 0.562 | 1.000
Unnamed: 7 | 1.000 | 0.562 | 1.000 | 0.562 | 0.562 | 1.000
Unnamed: 8 | 1.000 | 0.562 | 0.562 | 1.000 | 0.562 | 1.000
Unnamed: 9 | 1.000 | 0.562 | 0.562 | 0.562 | 1.000 | 1.000
Unnamed: 10 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Variable | Unnamed: 7 | Unnamed: 3 | Unnamed: 9 | Unnamed: 8 | Unnamed: 10
Unnamed: 7 | 1.000 | 0.372 | 0.372 | 0.372 | 0.953
Unnamed: 3 | 0.372 | 1.000 | 0.372 | 0.372 | 0.953
Unnamed: 9 | 0.372 | 0.372 | 1.000 | 0.372 | 0.953
Unnamed: 8 | 0.372 | 0.372 | 0.372 | 1.000 | 0.953
Unnamed: 10 | 0.953 | 0.953 | 0.953 | 0.953 | 1.000

Variable | Unnamed: 3 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10
Unnamed: 3 | 1.000 | 0.372 | 0.372 | 0.372 | 0.953
Unnamed: 7 | 0.372 | 1.000 | 0.372 | 0.372 | 0.953
Unnamed: 8 | 0.372 | 0.372 | 1.000 | 0.372 | 0.953
Unnamed: 9 | 0.372 | 0.372 | 0.372 | 1.000 | 0.953
Unnamed: 10 | 0.953 | 0.953 | 0.953 | 0.953 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
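
To make the idea of nullity correlation concrete, here is a minimal sketch that assumes it is the Pearson correlation between the "is missing" indicators of two columns. The report does not state the exact formula, so treat that as an assumption. `nullity_correlation` is a hypothetical helper and the columns are toy data, not the real dataset.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'is missing' indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is never (or always) missing has no defined correlation.
        return float("nan")
    return cov / sqrt(var_a * var_b)

# Toy columns: None marks a missing cell.
col_x = [None, None, 1.0, 2.0, None]
col_y = [None, None, 3.0, 4.0, None]     # missing in exactly the same rows as col_x
col_z = ["a", None, "b", "c", "d"]       # missing in only one of those rows

same_pattern = nullity_correlation(col_x, col_y)
partial_overlap = nullity_correlation(col_x, col_z)
print(same_pattern, partial_overlap)
```

Columns that are always missing together, as the five "Unsupported" columns appear to be in the sample rows, would score close to 1 under this definition.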

Sample

(index) | 국내ㆍ외 시멘트 중금속 분석결과 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10
0 | 대상년월 : 2022년 1월 | NaN | NaN | <NA> | NaN | NaN | NaN | <NA> | <NA> | <NA> | <NA>
1 | <NA> | NaN | NaN | <NA> | NaN | NaN | NaN | <NA> | <NA> | <NA> | <NA>
2 | □ 최종 분석결과(정량한계 반영) | NaN | NaN | <NA> | NaN | NaN | NaN | <NA> | <NA> | <NA> | (단위 : mg/kg, Bq/g)
3 | 구분 | Cr6+ | As | Cd | Cu | Hg | Pb | 134Cs | 137Cs | 131I | 비고
4 | 현대(영월) | 5.44 | 3.497 | 불검출 | 119.232 | 불검출 | 27.53 | 결정준위 미만 | 결정준위 미만 | 결정준위 미만 | 국내산
5 | 현대(단양) | 8.25 | 4.459 | 불검출 | 141.714 | 불검출 | 43.11 | 결정준위 미만 | 결정준위 미만 | 결정준위 미만 | 국내산
6 | 아세아(제천) | 5.13 | 12.639 | 불검출 | 261.201 | 불검출 | 83.08 | 결정준위 미만 | 결정준위 미만 | 결정준위 미만 | 국내산
7 | 삼표(삼척) | 17.54 | 12.873 | 불검출 | 266.667 | 0.0816 | 94.05 | 결정준위 미만 | 결정준위 미만 | 결정준위 미만 | 국내산
8 | 쌍용(동해) | 10.36 | 14.541 | 불검출 | 263.795 | 불검출 | 78.28 | 결정준위 미만 | 결정준위 미만 | 결정준위 미만 | 국내산
9 | 쌍용(영월) | 6.07 | 12.942 | 불검출 | 147.579 | 불검출 | 28 | 결정준위 미만 | 결정준위 미만 | 결정준위 미만 | 국내산

(index) | 국내ㆍ외 시멘트 중금속 분석결과 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10
12 | 한라(옥계) | 9.25 | 12.791 | 불검출 | 264.886 | 0.093656 | 101.29 | 결정준위 미만 | 결정준위 미만 | 결정준위 미만 | 국내산
13 | 고려(장성) | 5.94 | 5.746 | 불검출 | 29.533 | 불검출 | 42.44 | 결정준위 미만 | 결정준위 미만 | 결정준위 미만 | 국내산
14 | 유니온(청주) | 불검출 | 4.709 | 불검출 | 불검출 | 불검출 | 29.47 | 결정준위 미만 | 결정준위 미만 | 결정준위 미만 | 국내산
15 | Sumitomo Osaka | 7.62 | 23.719 | 불검출 | 336.401 | 불검출 | 74.57 | 결정준위 미만 | 결정준위 미만 | 결정준위 미만 | 수입산
16 | <NA> | NaN | NaN | <NA> | NaN | NaN | NaN | <NA> | <NA> | <NA> | <NA>
17 | ※ Cr(Ⅵ)의 자율협약기준은 2009년부터 20 mg/kg임(일본의 시멘트업계 자율관리기준 : 20 mg/kg) | NaN | NaN | <NA> | NaN | NaN | NaN | <NA> | <NA> | <NA> | <NA>
18 | ※ 정량한계 이하는 “불검출”로 표기하였음 | NaN | NaN | <NA> | NaN | NaN | NaN | <NA> | <NA> | <NA> | <NA>
19 | ※ 방사능(134Cs, 137Cs, 131I)은 한국인정기구 공인시험기관에서 분석 | NaN | NaN | <NA> | NaN | NaN | NaN | <NA> | <NA> | <NA> | <NA>
20 | - 결정준위 미만 : 검출되지 않을 확률이 큼 | NaN | NaN | <NA> | NaN | NaN | NaN | <NA> | <NA> | <NA> | <NA>
21 | - MDA 미만 : 최소검출가능농도(Minimum Detectable Activity) 미만으로 수치가 매우 낮아(환경 준위) 정량화 할 수 없는 값 | NaN | NaN | <NA> | NaN | NaN | NaN | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

(index) | 국내ㆍ외 시멘트 중금속 분석결과 | Unnamed: 3 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2
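
The duplicate entry above is an all-empty row that occurs twice. Here is a minimal sketch of how such rows can be found by counting identical tuples; the rows are toy data rather than the real dataset, and this is not necessarily how the report detects duplicates.

```python
from collections import Counter

# Toy rows: None marks a missing cell.
rows = [
    ("현대(영월)", "불검출", "국내산"),
    (None, None, None),
    ("현대(단양)", "불검출", "국내산"),
    (None, None, None),
]

counts = Counter(rows)
# Rows that occur more than once, with their number of occurrences.
duplicates = {row: n for row, n in counts.items() if n > 1}
# Redundant copies, i.e. occurrences beyond the first of each duplicated row.
extra_rows = sum(n - 1 for n in duplicates.values())

print(duplicates)
print(extra_rows)
```

In this toy case the empty row occurs twice and contributes one redundant copy, which mirrors the "# duplicates 2" entry and the single duplicate row reported in the overview.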