Overview

Dataset statistics

Number of variables: 20
Number of observations: 88
Missing cells: 437
Missing cells (%): 24.8%
Duplicate rows: 2
Duplicate rows (%): 2.3%
Total size in memory: 13.9 KiB
Average record size in memory: 161.5 B
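
The percentages above follow directly from the reported counts. The short check below is only an illustrative sketch: the variable names are made up here, and it assumes the missing-cell percentage is taken over all cells (variables times observations).

```python
# Counts copied from the "Dataset statistics" block of the report.
n_variables = 20
n_observations = 88
missing_cells = 437
duplicate_rows = 2

# Assumption: "Missing cells (%)" is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: "Duplicate rows (%)" is relative to the number of observations.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"total cells:    {total_cells}")
print(f"missing cells:  {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
```

Both rounded values agree with the figures reported above (24.8% and 2.3%).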

Variable types

Unsupported: 14
Text: 2
Categorical: 4

Dataset

Description: Raw measurement data from periodic monitoring of how radioactive materials are distributed in the seas near Korea.
URL: https://www.data.go.kr/data/15120894/fileData.do

Alerts

Dataset has 2 (2.3%) duplicate rows [Duplicates]
생태구 is highly overall correlated with Unnamed: 15 and 1 other fields [High correlation]
Unnamed: 15 is highly overall correlated with 생태구 and 2 other fields [High correlation]
Unnamed: 10 is highly overall correlated with Unnamed: 15 and 1 other fields [High correlation]
Unnamed: 18 is highly overall correlated with 생태구 and 2 other fields [High correlation]
생태구 is highly imbalanced (60.6%) [Imbalance]
조사년월 has 2 (2.3%) missing values [Missing]
Unnamed: 1 has 4 (4.5%) missing values [Missing]
구분 has 83 (94.3%) missing values [Missing]
정점 has 8 (9.1%) missing values [Missing]
분석 결과 has 5 (5.7%) missing values [Missing]
Unnamed: 6 has 5 (5.7%) missing values [Missing]
Unnamed: 7 has 31 (35.2%) missing values [Missing]
Unnamed: 8 has 33 (37.5%) missing values [Missing]
Unnamed: 9 has 15 (17.0%) missing values [Missing]
Unnamed: 11 has 21 (23.9%) missing values [Missing]
Unnamed: 12 has 22 (25.0%) missing values [Missing]
Unnamed: 13 has 22 (25.0%) missing values [Missing]
Unnamed: 14 has 21 (23.9%) missing values [Missing]
Unnamed: 16 has 55 (62.5%) missing values [Missing]
Unnamed: 17 has 55 (62.5%) missing values [Missing]
Unnamed: 19 has 55 (62.5%) missing values [Missing]
조사년월 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 1 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
분석 결과 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 6 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 7 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 8 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 9 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 11 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 12 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 13 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 14 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 16 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 17 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 19 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]

Reproduction

Analysis started: 2023-12-12 13:24:00.777462
Analysis finished: 2023-12-12 13:24:02.246996
Duration: 1.47 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

조사년월
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 2
Missing (%): 2.3%
Memory size: 836.0 B

Unnamed: 1
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 4
Missing (%): 4.5%
Memory size: 836.0 B

구분
Text

MISSING 

Distinct: 4
Distinct (%): 80.0%
Missing: 83
Missing (%): 94.3%
Memory size: 836.0 B

Length

Max length: 5
Median length: 2
Mean length: 3
Min length: 2

Characters and Unicode

Total characters: 15
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
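
As a rough illustration of how such character statistics can be derived, the sketch below counts the characters of the five non-missing 구분 values listed in this report and looks up their Unicode general category with Python's standard `unicodedata` module. The standard library has no script lookup, so `script_hint` is a hypothetical helper that takes the first word of the character name as a stand-in; this is not necessarily how the profiling tool itself computes scripts.

```python
import unicodedata
from collections import Counter

# The five non-missing values of the 구분 column, as listed in this report.
values = ["해수", "구분", "해저퇴적물", "구분", "해양생물"]

# Character frequencies over all values.
chars = Counter("".join(values))

# General category of every character occurrence ("Lo" = Letter, other).
categories = Counter(unicodedata.category(ch) for ch in chars.elements())


def script_hint(ch: str) -> str:
    """Hypothetical stand-in for a script lookup: first word of the Unicode name."""
    return unicodedata.name(ch, "UNKNOWN").split()[0]


scripts = Counter(script_hint(ch) for ch in chars.elements())

print("total characters:   ", sum(chars.values()))
print("distinct characters:", len(chars))
print("most common:        ", chars.most_common(4))
print("categories:         ", dict(categories))
print("script hints:       ", dict(scripts))
```

The totals match the figures above: 15 characters, 10 of them distinct, all in a single category and script.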

Unique

Unique: 3
Unique (%): 60.0%

Sample

1st row: 해수
2nd row: 구분
3rd row: 해저퇴적물
4th row: 구분
5th row: 해양생물

Value | Count | Frequency (%)
구분 | 2 | 40.0%
해수 | 1 | 20.0%
해저퇴적물 | 1 | 20.0%
해양생물 | 1 | 20.0%

Most occurring characters

Value | Count | Frequency (%)
해 | 3 | 20.0%
구 | 2 | 13.3%
분 | 2 | 13.3%
물 | 2 | 13.3%
수 | 1 | 6.7%
저 | 1 | 6.7%
퇴 | 1 | 6.7%
적 | 1 | 6.7%
양 | 1 | 6.7%
생 | 1 | 6.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 15 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
해 | 3 | 20.0%
구 | 2 | 13.3%
분 | 2 | 13.3%
물 | 2 | 13.3%
수 | 1 | 6.7%
저 | 1 | 6.7%
퇴 | 1 | 6.7%
적 | 1 | 6.7%
양 | 1 | 6.7%
생 | 1 | 6.7%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 15 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
해 | 3 | 20.0%
구 | 2 | 13.3%
분 | 2 | 13.3%
물 | 2 | 13.3%
수 | 1 | 6.7%
저 | 1 | 6.7%
퇴 | 1 | 6.7%
적 | 1 | 6.7%
양 | 1 | 6.7%
생 | 1 | 6.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 15 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
해 | 3 | 20.0%
구 | 2 | 13.3%
분 | 2 | 13.3%
물 | 2 | 13.3%
수 | 1 | 6.7%
저 | 1 | 6.7%
퇴 | 1 | 6.7%
적 | 1 | 6.7%
양 | 1 | 6.7%
생 | 1 | 6.7%

생태구
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 8
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Memory size: 836.0 B

Value | Count
<NA> | 72
대한해협 | 3
동해 | 3
제주 | 3
서해중부 | 2
Other values (3) | 5

Length

Max length: 6
Median length: 4
Mean length: 3.8636364
Min length: 2

Unique

Unique: 1
Unique (%): 1.1%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: 서해중부
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 72 | 81.8%
대한해협 | 3 | 3.4%
동해 | 3 | 3.4%
제주 | 3 | 3.4%
서해중부 | 2 | 2.3%
서남해역 | 2 | 2.3%
생태구 | 2 | 2.3%
주요 항만 | 1 | 1.1%
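
The 60.6% imbalance reported for 생태구 is consistent with a normalised-entropy score, 1 - H / log(k), computed over the eight value counts in the table above. The sketch below assumes that formula; `imbalance_score` is an illustrative name, not a function taken from the profiling library.

```python
from math import log


def imbalance_score(counts):
    """Return 1 - H / log(k): 0 for a uniform column, close to 1 when one value dominates.

    Assumption: this normalised-entropy definition is what the report's
    imbalance percentage is based on.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / log(k)


# Counts from the Common Values table for 생태구 (<NA> is treated as a category).
counts = [72, 3, 3, 3, 2, 2, 2, 1]
score = imbalance_score(counts)
print(f"{score:.1%}")
```

With these counts the score rounds to 60.6%, which matches the alert. A perfectly uniform column would score 0.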

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 72 | 80.9%
대한해협 | 3 | 3.4%
동해 | 3 | 3.4%
제주 | 3 | 3.4%
서해중부 | 2 | 2.2%
서남해역 | 2 | 2.2%
생태구 | 2 | 2.2%
주요 | 1 | 1.1%
항만 | 1 | 1.1%

정점
Text

MISSING 

Distinct: 47
Distinct (%): 58.8%
Missing: 8
Missing (%): 9.1%
Memory size: 836.0 B

Length

Max length: 6
Median length: 3
Mean length: 3.55
Min length: 2

Characters and Unicode

Total characters: 284
Distinct characters: 70
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 14
Unique (%): 17.5%

Sample

1st row: 인천2
2nd row: 인천13
3rd row: 아산1
4th row: 가로림3
5th row: 태안3

Value | Count | Frequency (%)
보령4 | 2 | 2.5%
기장3 | 2 | 2.5%
거제도동안3 | 2 | 2.5%
감포4 | 2 | 2.5%
부산5 | 2 | 2.5%
울산2 | 2 | 2.5%
고흥5 | 2 | 2.5%
축산2 | 2 | 2.5%
통영외안5 | 2 | 2.5%
영일만11 | 2 | 2.5%
Other values (37) | 60 | 75.0%

Most occurring characters

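Value | Count | Frequency (%)
2 | 21 | 7.4%
(not preserved) | 19 | 6.7%
1 | 19 | 6.7%
3 | 17 | 6.0%
(not preserved) | 16 | 5.6%
(not preserved) | 11 | 3.9%
(not preserved) | 9 | 3.2%
(not preserved) | 8 | 2.8%
4 | 8 | 2.8%
(not preserved) | 7 | 2.5%
Other values (60) | 149 | 52.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 206 | 72.5%
Decimal Number | 75 | 26.4%
Uppercase Letter | 3 | 1.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 19 | 9.2%
(not preserved) | 16 | 7.8%
(not preserved) | 11 | 5.3%
(not preserved) | 9 | 4.4%
(not preserved) | 8 | 3.9%
(not preserved) | 7 | 3.4%
(not preserved) | 6 | 2.9%
(not preserved) | 6 | 2.9%
(not preserved) | 6 | 2.9%
(not preserved) | 5 | 2.4%
Other values (52) | 113 | 54.9%

Decimal Number
Value | Count | Frequency (%)
2 | 21 | 28.0%
1 | 19 | 25.3%
3 | 17 | 22.7%
4 | 8 | 10.7%
5 | 6 | 8.0%
9 | 3 | 4.0%
6 | 1 | 1.3%

Uppercase Letter
Value | Count | Frequency (%)
H | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 206 | 72.5%
Common | 75 | 26.4%
Latin | 3 | 1.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 19 | 9.2%
(not preserved) | 16 | 7.8%
(not preserved) | 11 | 5.3%
(not preserved) | 9 | 4.4%
(not preserved) | 8 | 3.9%
(not preserved) | 7 | 3.4%
(not preserved) | 6 | 2.9%
(not preserved) | 6 | 2.9%
(not preserved) | 6 | 2.9%
(not preserved) | 5 | 2.4%
Other values (52) | 113 | 54.9%

Common
Value | Count | Frequency (%)
2 | 21 | 28.0%
1 | 19 | 25.3%
3 | 17 | 22.7%
4 | 8 | 10.7%
5 | 6 | 8.0%
9 | 3 | 4.0%
6 | 1 | 1.3%

Latin
Value | Count | Frequency (%)
H | 3 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 206 | 72.5%
ASCII | 78 | 27.5%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
2 | 21 | 26.9%
1 | 19 | 24.4%
3 | 17 | 21.8%
4 | 8 | 10.3%
5 | 6 | 7.7%
9 | 3 | 3.8%
H | 3 | 3.8%
6 | 1 | 1.3%

Hangul
Value | Count | Frequency (%)
(not preserved) | 19 | 9.2%
(not preserved) | 16 | 7.8%
(not preserved) | 11 | 5.3%
(not preserved) | 9 | 4.4%
(not preserved) | 8 | 3.9%
(not preserved) | 7 | 3.4%
(not preserved) | 6 | 2.9%
(not preserved) | 6 | 2.9%
(not preserved) | 6 | 2.9%
(not preserved) | 5 | 2.4%
Other values (52) | 113 | 54.9%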

분석 결과
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 5
Missing (%): 5.7%
Memory size: 836.0 B

Unnamed: 6
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 5
Missing (%): 5.7%
Memory size: 836.0 B

Unnamed: 7
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 31
Missing (%): 35.2%
Memory size: 836.0 B

Unnamed: 8
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 33
Missing (%): 37.5%
Memory size: 836.0 B

Unnamed: 9
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 15
Missing (%): 17.0%
Memory size: 836.0 B

Unnamed: 10
Categorical

HIGH CORRELATION 

Distinct: 19
Distinct (%): 21.6%
Missing: 0
Missing (%): 0.0%
Memory size: 836.0 B

Value | Count
± | 33
<NA> | 14
<2.78 | 6
<2.81 | 6
<2.80 | 5
Other values (14) | 24

Length

Max length: 6
Median length: 5
Mean length: 3.3181818
Min length: 1

Unique

Unique: 7
Unique (%): 8.0%

Sample

1st row: 3H
2nd row: (Bq/L)
3rd row: <2.83
4th row: <2.76
5th row: <2.78

Common Values

Value | Count | Frequency (%)
± | 33 | 37.5%
<NA> | 14 | 15.9%
<2.78 | 6 | 6.8%
<2.81 | 6 | 6.8%
<2.80 | 5 | 5.7%
<2.84 | 4 | 4.5%
<2.82 | 3 | 3.4%
<2.83 | 2 | 2.3%
<2.76 | 2 | 2.3%
<2.79 | 2 | 2.3%
Other values (9) | 11 | 12.5%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
± | 33 | 37.5%
na | 14 | 15.9%
2.78 | 6 | 6.8%
2.81 | 6 | 6.8%
2.80 | 5 | 5.7%
2.84 | 4 | 4.5%
2.82 | 3 | 3.4%
2.86 | 2 | 2.3%
2.75 | 2 | 2.3%
2.79 | 2 | 2.3%
Other values (9) | 11 | 12.5%

Unnamed: 11
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 21
Missing (%): 23.9%
Memory size: 836.0 B

Unnamed: 12
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 22
Missing (%): 25.0%
Memory size: 836.0 B

Unnamed: 13
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 22
Missing (%): 25.0%
Memory size: 836.0 B

Unnamed: 14
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 21
Missing (%): 23.9%
Memory size: 836.0 B

Unnamed: 15
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 836.0 B

Value | Count
<NA> | 55
± | 33

Length

Max length: 4
Median length: 4
Mean length: 2.875
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: ±
3rd row: ±
4th row: ±
5th row: ±

Common Values

Value | Count | Frequency (%)
<NA> | 55 | 62.5%
± | 33 | 37.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 55 | 62.5%
± | 33 | 37.5%

Unnamed: 16
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 55
Missing (%): 62.5%
Memory size: 836.0 B

Unnamed: 17
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 55
Missing (%): 62.5%
Memory size: 836.0 B

Unnamed: 18
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 836.0 B

Value | Count
<NA> | 55
± | 33

Length

Max length: 4
Median length: 4
Mean length: 2.875
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: ±
3rd row: ±
4th row: ±
5th row: ±

Common Values

Value | Count | Frequency (%)
<NA> | 55 | 62.5%
± | 33 | 37.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 55 | 62.5%
± | 33 | 37.5%

Unnamed: 19
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 55
Missing (%): 62.5%
Memory size: 836.0 B

Correlations

 | 구분 | 생태구 | 정점 | Unnamed: 10
구분 | 1.000 | 1.000 | 1.000 | 0.000
생태구 | 1.000 | 1.000 | 1.000 | 0.657
정점 | 1.000 | 1.000 | 1.000 | 0.471
Unnamed: 10 | 0.000 | 0.657 | 0.471 | 1.000

 | 생태구 | Unnamed: 15 | Unnamed: 10 | Unnamed: 18
생태구 | 1.000 | 1.000 | 0.000 | 1.000
Unnamed: 15 | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 10 | 0.000 | 1.000 | 1.000 | 1.000
Unnamed: 18 | 1.000 | 1.000 | 1.000 | 1.000

 | 생태구 | Unnamed: 10 | Unnamed: 15 | Unnamed: 18
생태구 | 1.000 | 0.000 | 1.000 | 1.000
Unnamed: 10 | 0.000 | 1.000 | 1.000 | 1.000
Unnamed: 15 | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 18 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
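
One common way to compute a nullity correlation is the Pearson correlation between the presence indicators of two columns (1 where a value is present, 0 where it is missing). The sketch below assumes that definition; the function names and the toy columns are illustrative only and do not come from the profiled dataset.

```python
from math import sqrt


def presence(column):
    """Map a column to 1.0 where a value is present and 0.0 where it is None."""
    return [0.0 if value is None else 1.0 for value in column]


def nullity_correlation(col_a, col_b):
    """Pearson correlation of two presence indicators.

    Returns None when either column is fully present or fully missing,
    because the correlation is undefined without variance.
    """
    a, b = presence(col_a), presence(col_b)
    if len(a) != len(b):
        raise ValueError("columns must have the same length")
    n = len(a)
    if n == 0:
        return None
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)


# Toy columns: a value and its uncertainty are missing together,
# while a third column is filled exactly where the first is empty.
value_col = ["9.8", None, "10.2", None]
uncertainty_col = ["2.4", None, "2.2", None]
complement_col = [None, "x", None, "y"]

print(nullity_correlation(value_col, uncertainty_col))  # 1.0
print(nullity_correlation(value_col, complement_col))   # -1.0
```

A value of 1 means the two columns are always missing together, and -1 means one is present exactly when the other is missing.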

Sample

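First rows

 | 조사년월 | Unnamed: 1 | 구분 | 생태구 | 정점 | 분석 결과 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12 | Unnamed: 13 | Unnamed: 14 | Unnamed: 15 | Unnamed: 16 | Unnamed: 17 | Unnamed: 18 | Unnamed: 19
0 |  |  | <NA> | <NA> | <NA> | 시료채취일 | 수심 | 137Cs | NaN | NaN | 3H | 전베타 | NaN | NaN | 239+240Pu | <NA> | NaN | 240Pu/239Pu | <NA> | NaN
1 | 2021 | 2 | <NA> | <NA> | <NA> | NaN | NaN | mBq/kg | ± | 불확도 | (Bq/L) | Bq/L | ± | 불확도 | μBq/kg | ± | 불확도 | NaN | ± | 불확도
2 | 2021 | 2 | 해수 | 서해중부 | 인천2 | 2021-02-20 00:00:00 | 표층 | <1.1 | NaN | NaN | <2.83 | 9.8 | ± | 2.4 | 4.23 | ± | 0.1 | 0.204 | ± | 0.011
3 | 2021 | 2 | <NA> | <NA> | 인천13 | 2021-02-19 00:00:00 | 표층 | 1.05 | ± | 0.18 | <2.76 | 9.4 | ± | 2.1 | 5.43 | ± | 0.28 | 0.19 | ± | 0.016
4 | 2021 | 2 | <NA> | <NA> | 아산1 | 2021-02-22 00:00:00 | 표층 | 1.39 | ± | 0.2 | <2.78 | 10.2 | ± | 2.2 | 2.94 | ± | 0.1 | 0.1963 | ± | 0.0076
5 | 2021 | 2 | <NA> | <NA> | 가로림3 | 2021-02-25 00:00:00 | 표층 | 1.29 | ± | 0.2 | <2.80 | 10.1 | ± | 2.3 | 4.59 | ± | 0.12 | 0.207 | ± | 0.01
6 | 2021 | 2 | <NA> | <NA> | 태안3 | 2021-02-25 00:00:00 | 표층 | 0.86 | ± | 0.16 | <2.80 | 10.4 | ± | 2.4 | 3.55 | ± | 0.12 | 0.189 | ± | 0.011
7 | 2021 | 2 | <NA> | <NA> | 보령4 | 2021-02-27 00:00:00 | 표층 | 1.26 | ± | 0.19 | <2.79 | 9.8 | ± | 2.2 | 5.48 | ± | 0.11 | 0.1916 | ± | 0.0071
8 | 2021 | 2 | <NA> | <NA> | 군산9 | 2021-02-27 00:00:00 | 표층 | 1.29 | ± | 0.2 | <2.81 | 10.2 | ± | 2.3 | 6.58 | ± | 0.14 | 0.1791 | ± | 0.0069
9 | 2021 | 2 | <NA> | <NA> | 전주포2 | 2021-02-28 00:00:00 | 표층 | 1.63 | ± | 0.23 | <2.84 | 10.2 | ± | 2.3 | 3.65 | ± | 0.073 | 0.187 | ± | 0.011

Last rows

 | 조사년월 | Unnamed: 1 | 구분 | 생태구 | 정점 | 분석 결과 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12 | Unnamed: 13 | Unnamed: 14 | Unnamed: 15 | Unnamed: 16 | Unnamed: 17 | Unnamed: 18 | Unnamed: 19
78 | 조사년월 | NaN | 구분 | 생태구 | 정점 | 분석 결과 | NaN | NaN | NaN | NaN | <NA> | NaN | NaN | NaN | NaN | <NA> | NaN | NaN | <NA> | NaN
79 |  |  | <NA> | <NA> | <NA> | 시료채취일 | 137Cs | NaN | NaN | NaN | <NA> | NaN | NaN | NaN | NaN | <NA> | NaN | NaN | <NA> | NaN
80 | 2021 | 2 | <NA> | <NA> | <NA> | NaN | mBq/kg | ± | 불확도 | NaN | <NA> | NaN | NaN | NaN | NaN | <NA> | NaN | NaN | <NA> | NaN
81 | 2021 | 2 | 해양생물 | 대한해협 | 거제도동안 | 2021-02-28 00:00:00 | <38.4 | NaN | NaN | NaN | <NA> | NaN | NaN | NaN | NaN | <NA> | NaN | NaN | <NA> | NaN
82 | 2021 | 2 | <NA> | <NA> | 부산연안 | 2021-02-28 00:00:00 | <36.6 | NaN | NaN | NaN | <NA> | NaN | NaN | NaN | NaN | <NA> | NaN | NaN | <NA> | NaN
83 | 2021 | 2 | <NA> | <NA> | 울산연안 | 2021-02-28 00:00:00 | <35.7 | NaN | NaN | NaN | <NA> | NaN | NaN | NaN | NaN | <NA> | NaN | NaN | <NA> | NaN
84 | 2021 | 2 | <NA> | <NA> | 구룡포연안 | 2021-02-28 00:00:00 | <25.4 | NaN | NaN | NaN | <NA> | NaN | NaN | NaN | NaN | <NA> | NaN | NaN | <NA> | NaN
85 | 2021 | 2 | <NA> | 동해 | 후포연안 | 2021-02-28 00:00:00 | <35.4 | NaN | NaN | NaN | <NA> | NaN | NaN | NaN | NaN | <NA> | NaN | NaN | <NA> | NaN
86 | 2021 | 2 | <NA> | <NA> | 속초연안 | 2021-02-28 00:00:00 | <20.1 | NaN | NaN | NaN | <NA> | NaN | NaN | NaN | NaN | <NA> | NaN | NaN | <NA> | NaN
87 | 2021 | 2 | <NA> | 제주 | 제주연안 | 2021-02-28 00:00:00 | <36.9 | NaN | NaN | NaN | <NA> | NaN | NaN | NaN | NaN | <NA> | NaN | NaN | <NA> | NaN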
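
The sample rows suggest why so many columns are flagged as unsupported: each measurement appears to be spread over up to three cells (value, a "±" sign, uncertainty), and some results are written as "<x". The sketch below shows one way such cells could be combined during cleaning. It rests on two assumptions not stated in the report: that "<x" marks a result below a detection limit x, and that the cell after "±" holds the uncertainty. `Measurement` and `parse_measurement` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Measurement:
    value: float                   # reported value, or the limit when below_limit is True
    uncertainty: Optional[float]   # None when no uncertainty cell is available
    below_limit: bool              # True for cells written as "<x" (assumed detection limit)


def is_missing(cell) -> bool:
    """Treat None, float NaN and the report's textual placeholders as missing."""
    if cell is None:
        return True
    if isinstance(cell, float) and math.isnan(cell):
        return True
    return str(cell).strip() in {"", "NaN", "<NA>"}


def parse_measurement(value, sign=None, uncertainty=None) -> Optional[Measurement]:
    """Combine a value cell, a sign cell and an uncertainty cell into one record."""
    if is_missing(value):
        return None
    text = str(value).strip()
    if text.startswith("<"):
        # Assumption: "<2.83" means "below a detection limit of 2.83".
        return Measurement(float(text[1:]), None, True)
    unc = None
    if sign == "±" and not is_missing(uncertainty):
        unc = float(uncertainty)
    return Measurement(float(text), unc, False)


# Cells taken from sample row 2 of the report.
print(parse_measurement("<2.83"))
print(parse_measurement("9.8", "±", "2.4"))
print(parse_measurement("NaN"))
```

Header and unit rows such as rows 0, 1 and 78 to 80 would have to be dropped or handled separately before applying a parser like this.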

Duplicate rows

Most frequently occurring

 | 구분 | 생태구 | 정점 | Unnamed: 10 | Unnamed: 15 | Unnamed: 18 | # duplicates
1 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 5
0 | 구분 | 생태구 | 정점 | <NA> | <NA> | <NA> | 2