Overview

Dataset statistics

Number of variables: 6
Number of observations: 35
Missing cells: 13
Missing cells (%): 6.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.8 KiB
Average record size in memory: 53.8 B
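
The percentage and per-record figures above follow from the raw counts. The snippet below is a minimal sketch that recomputes them; the variable names are illustrative, and the record size is only approximate because the reported total (1.8 KiB) is already rounded.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 6
n_observations = 35
missing_cells = 13
total_bytes = 1.8 * 1024  # 1.8 KiB as reported (a rounded value)

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_bytes / n_observations  # approximate, see note above

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"approx. record size: {avg_record_bytes:.1f} B")
```

The small gap between the approximate record size and the reported 53.8 B comes from the rounding of the total size.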

Variable types

Categorical: 3
Text: 1
Numeric: 2

Dataset

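Description: Status of the air pollution monitoring network by district in Busan Metropolitan City (부산광역시). The dataset provides the network type (측정망 종류), station name (측정소명), measured items (측정항목), equipment name (측정장비명), year of first installation (최초 설치연도), and replacement year (교체연도).
URL: https://www.data.go.kr/data/3076556/fileData.do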

Alerts

최초 설치연도 is highly overall correlated with 측정항목 (High correlation)
교체연도 is highly overall correlated with 측정망 종류 and 2 other fields (High correlation)
측정망 종류 is highly overall correlated with 교체연도 and 2 other fields (High correlation)
측정항목 is highly overall correlated with 최초 설치연도 and 3 other fields (High correlation)
측정장비명 is highly overall correlated with 교체연도 and 2 other fields (High correlation)
최초 설치연도 has 3 (8.6%) missing values (Missing)
교체연도 has 10 (28.6%) missing values (Missing)

Reproduction

Analysis started: 2023-12-12 04:37:04.266078
Analysis finished: 2023-12-12 04:37:05.354164
Duration: 1.09 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

측정망 종류
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 8.6%
Missing: 0
Missing (%): 0.0%
Memory size: 412.0 B
[Figure: bar chart of category counts: 도시대기 (27), 중금속, 도로변대기]

Length

Max length: 5
Median length: 4
Mean length: 3.9428571
Min length: 3
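
These length statistics can be reproduced from the three category values and their row counts (taken from this variable's Common Values table). The snippet is a sketch that assumes length means the number of characters, which is how Python's `len` counts Hangul syllables.

```python
from statistics import mean, median

# Category values and row counts from the Common Values table.
counts = {"도시대기": 27, "중금속": 5, "도로변대기": 3}

# One length entry per row of the dataset.
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

print("max:", max(lengths))
print("median:", median(lengths))
print("mean:", round(mean(lengths), 7))
print("min:", min(lengths))
```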

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 도시대기
2nd row: 도시대기
3rd row: 도시대기
4th row: 도시대기
5th row: 도시대기

Common Values

Value | Count | Frequency (%)
도시대기 | 27 | 77.1%
중금속 | 5 | 14.3%
도로변대기 | 3 | 8.6%
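
The frequency column and the "Distinct (%)" figure are both shares of the 35 rows. A minimal sketch of that arithmetic, using the counts from the table above (variable names are illustrative):

```python
from collections import Counter

# Counts taken from the Common Values table.
counts = Counter({"도시대기": 27, "중금속": 5, "도로변대기": 3})
n_rows = sum(counts.values())

distinct = len(counts)
distinct_pct = 100 * distinct / n_rows

# Share of rows per category, as percentages.
frequencies = {value: 100 * count / n_rows for value, count in counts.items()}

for value, count in counts.most_common():
    print(f"{value}\t{count}\t{frequencies[value]:.1f}%")
print(f"distinct: {distinct} ({distinct_pct:.1f}%)")
```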

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common Values plot]

Value | Count | Frequency (%)
도시대기 | 27 | 77.1%
중금속 | 5 | 14.3%
도로변대기 | 3 | 8.6%

측정소명
Text

Distinct: 32
Distinct (%): 91.4%
Missing: 0
Missing (%): 0.0%
Memory size: 412.0 B

Length

Max length: 4
Median length: 3
Mean length: 3.0285714
Min length: 2

Characters and Unicode

Total characters: 106
Distinct characters: 45
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
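
As an illustration of such character properties, the sketch below uses Python's standard `unicodedata` module on the first five sample values of this variable. It covers only the general category and character name; the standard library does not expose script or block, so those parts of the report are not reproduced here. The counts differ from the report's totals because only five of the 35 rows are used.

```python
import unicodedata
from collections import Counter

# First five sample values of this variable (a subset of the 35 rows).
names = ["광복동", "장림동", "학장동", "덕천동", "연산동"]

# General-category code per character, e.g. "Lo" = Other Letter,
# "Zs" = Space Separator.
categories = Counter(unicodedata.category(ch) for name in names for ch in name)

# Character frequencies across the subset.
chars = Counter(ch for name in names for ch in name)

print(categories)
print(chars.most_common(1))
print(unicodedata.name(chars.most_common(1)[0][0]))
```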

Unique

Unique: 29
Unique (%): 82.9%

Sample

1st row: 광복동
2nd row: 장림동
3rd row: 학장동
4th row: 덕천동
5th row: 연산동

Value | Count | Frequency (%)
덕천동 | 2 | 5.7%
광안동 | 2 | 5.7%
학장동 | 2 | 5.7%
연산동 | 2 | 5.7%
부곡동 | 2 | 5.7%
회동동 | 1 | 2.9%
초량동 | 1 | 2.9%
온천동 | 1 | 2.9%
명지동 | 1 | 2.9%
대신동 | 1 | 2.9%
Other values (20) | 20 | 57.1%

Most occurring characters

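Value | Count | Frequency (%)
(not preserved) | 33 | 31.1%
(not preserved) | 5 | 4.7%
(not preserved) | 4 | 3.8%
(not preserved) | 3 | 2.8%
(not preserved) | 3 | 2.8%
(not preserved) | 3 | 2.8%
(not preserved) | 3 | 2.8%
(not preserved) | 3 | 2.8%
(not preserved) | 3 | 2.8%
(not preserved) | 3 | 2.8%
Other values (35) | 43 | 40.6%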

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 104 | 98.1%
Space Separator | 2 | 1.9%

Most frequent character per category

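Other Letter
Value | Count | Frequency (%)
(not preserved) | 33 | 31.7%
(not preserved) | 5 | 4.8%
(not preserved) | 4 | 3.8%
(not preserved) | 3 | 2.9%
(not preserved) | 3 | 2.9%
(not preserved) | 3 | 2.9%
(not preserved) | 3 | 2.9%
(not preserved) | 3 | 2.9%
(not preserved) | 3 | 2.9%
(not preserved) | 3 | 2.9%
Other values (34) | 41 | 39.4%

Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%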

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 104 | 98.1%
Common | 2 | 1.9%

Most frequent character per script

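Hangul
Value | Count | Frequency (%)
(not preserved) | 33 | 31.7%
(not preserved) | 5 | 4.8%
(not preserved) | 4 | 3.8%
(not preserved) | 3 | 2.9%
(not preserved) | 3 | 2.9%
(not preserved) | 3 | 2.9%
(not preserved) | 3 | 2.9%
(not preserved) | 3 | 2.9%
(not preserved) | 3 | 2.9%
(not preserved) | 3 | 2.9%
Other values (34) | 41 | 39.4%

Common
Value | Count | Frequency (%)
(space) | 2 | 100.0%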

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 104 | 98.1%
ASCII | 2 | 1.9%

Most frequent character per block

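Hangul
Value | Count | Frequency (%)
(not preserved) | 33 | 31.7%
(not preserved) | 5 | 4.8%
(not preserved) | 4 | 3.8%
(not preserved) | 3 | 2.9%
(not preserved) | 3 | 2.9%
(not preserved) | 3 | 2.9%
(not preserved) | 3 | 2.9%
(not preserved) | 3 | 2.9%
(not preserved) | 3 | 2.9%
(not preserved) | 3 | 2.9%
Other values (34) | 41 | 39.4%

ASCII
Value | Count | Frequency (%)
(space) | 2 | 100.0%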

측정항목
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 8.6%
Missing: 0
Missing (%): 0.0%
Memory size: 412.0 B
[Figure: bar chart of category counts: "SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도" (27); "Pb, Cd, Cr, Cu, Mn, Fe, Ni,Be,As"; "NOx, O3, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도"]

Length

Max length: 47
Median length: 47
Mean length: 44.085714
Min length: 32

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도
2nd row: SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도
3rd row: SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도
4th row: SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도
5th row: SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도

Common Values

Value | Count | Frequency (%)
SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | 27 | 77.1%
Pb, Cd, Cr, Cu, Mn, Fe, Ni,Be,As | 5 | 14.3%
NOx, O3, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | 3 | 8.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common Values plot]

Value | Count | Frequency (%)
기온 | 30 | 9.1%
nox | 30 | 9.1%
o3 | 30 | 9.1%
습도 | 30 | 9.1%
pm-10 | 30 | 9.1%
pm-2.5 | 30 | 9.1%
풍향 | 30 | 9.1%
풍속 | 30 | 9.1%
so2 | 27 | 8.2%
co | 27 | 8.2%
Other values (7) | 35 | 10.6%
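
The token counts above are consistent with splitting each 측정항목 value on whitespace, trimming commas, and lowercasing. The sketch below applies that assumed rule to the three values and row counts from the Common Values table; it is a reconstruction that happens to match the reported numbers, not necessarily how the profiling tool tokenizes. The `tokenize` helper is hypothetical.

```python
from collections import Counter

# (value, row count) pairs from the Common Values table of this variable.
values = [
    ("SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도", 27),
    ("Pb, Cd, Cr, Cu, Mn, Fe, Ni,Be,As", 5),
    ("NOx, O3, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도", 3),
]

def tokenize(text: str) -> list[str]:
    # Assumed rule: split on whitespace, trim surrounding commas, lowercase.
    return [tok.strip(",").lower() for tok in text.split()]

word_counts = Counter()
for text, rows in values:
    for tok in tokenize(text):
        word_counts[tok] += rows  # each row contributes its tokens once

total = sum(word_counts.values())
for word, count in word_counts.most_common(10):
    print(f"{word}\t{count}\t{100 * count / total:.1f}%")
```

Under this rule "Ni,Be,As" stays a single token because it contains no spaces, which would explain why the table reports seven other values with a combined count of 35.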

측정장비명
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 11.4%
Missing: 0
Missing (%): 0.0%
Memory size: 412.0 B
[Figure: bar chart of category counts: HORIBA 370 serise (27), Thermo iQ seriese, MicroPNS HVS 16, H/V air sampler (SIBATA) (2)]

Length

Max length: 24
Median length: 17
Mean length: 17.228571
Min length: 15

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: HORIBA 370 serise
2nd row: HORIBA 370 serise
3rd row: HORIBA 370 serise
4th row: HORIBA 370 serise
5th row: HORIBA 370 serise

Common Values

Value | Count | Frequency (%)
HORIBA 370 serise | 27 | 77.1%
Thermo iQ seriese | 3 | 8.6%
MicroPNS HVS 16 | 3 | 8.6%
H/V air sampler (SIBATA) | 2 | 5.7%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common Values plot]

Value | Count | Frequency (%)
horiba | 27 | 25.2%
370 | 27 | 25.2%
serise | 27 | 25.2%
thermo | 3 | 2.8%
iq | 3 | 2.8%
seriese | 3 | 2.8%
micropns | 3 | 2.8%
hvs | 3 | 2.8%
16 | 3 | 2.8%
h/v | 2 | 1.9%
Other values (3) | 6 | 5.6%

최초 설치연도
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 22
Distinct (%): 68.8%
Missing: 3
Missing (%): 8.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 2003.1875
Minimum: 1979
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 447.0 B

Quantile statistics

Minimum: 1979
5-th percentile: 1979.55
Q1: 1996.75
Median: 2002.5
Q3: 2018.25
95-th percentile: 2020
Maximum: 2021
Range: 42
Interquartile range (IQR): 21.5

Descriptive statistics

Standard deviation: 13.250533
Coefficient of variation (CV): 0.0066147241
Kurtosis: -0.84608512
Mean: 2003.1875
Median Absolute Deviation (MAD): 9
Skewness: -0.3043157
Sum: 64102
Variance: 175.57661
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=22)]
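
Several of the statistics reported for 최초 설치연도 are derived from one another. The sketch below checks those relationships using only the numbers printed in this report (no raw data); the variable names are illustrative, and small differences are expected because the reported values are rounded.

```python
import math

# Values copied from the statistics reported for 최초 설치연도.
n_rows, n_missing = 35, 3
mean, std = 2003.1875, 13.250533
q1, q3 = 1996.75, 2018.25
minimum, maximum = 1979, 2021

n = n_rows - n_missing            # non-missing observations
total = mean * n                  # compare with reported Sum
variance = std ** 2               # compare with reported Variance
cv = std / mean                   # compare with reported CV
iqr = q3 - q1                     # compare with reported IQR
value_range = maximum - minimum   # compare with reported Range

print("sum matches:", math.isclose(total, 64102))
print("variance matches:", math.isclose(variance, 175.57661, abs_tol=1e-3))
print("CV matches:", math.isclose(cv, 0.0066147241, abs_tol=1e-8))
print("IQR:", iqr, "range:", value_range)
```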

Common values
Value | Count | Frequency (%)
2019 | 5 | 14.3%
1999 | 3 | 8.6%
1996 | 2 | 5.7%
1997 | 2 | 5.7%
2020 | 2 | 5.7%
1979 | 2 | 5.7%
2011 | 1 | 2.9%
2006 | 1 | 2.9%
2007 | 1 | 2.9%
2021 | 1 | 2.9%
Other values (12) | 12 | 34.3%
(Missing) | 3 | 8.6%

Minimum 10 values
Value | Count | Frequency (%)
1979 | 2 | 5.7%
1980 | 1 | 2.9%
1983 | 1 | 2.9%
1985 | 1 | 2.9%
1988 | 1 | 2.9%
1996 | 2 | 5.7%
1997 | 2 | 5.7%
1999 | 3 | 8.6%
2000 | 1 | 2.9%
2001 | 1 | 2.9%

Maximum 10 values
Value | Count | Frequency (%)
2021 | 1 | 2.9%
2020 | 2 | 5.7%
2019 | 5 | 14.3%
2018 | 1 | 2.9%
2012 | 1 | 2.9%
2011 | 1 | 2.9%
2007 | 1 | 2.9%
2006 | 1 | 2.9%
2005 | 1 | 2.9%
2004 | 1 | 2.9%

교체연도
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 10
Distinct (%): 40.0%
Missing: 10
Missing (%): 28.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 2015.36
Minimum: 2010
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 447.0 B

Quantile statistics

Minimum: 2010
5-th percentile: 2010
Q1: 2013
Median: 2015
Q3: 2018
95-th percentile: 2022
Maximum: 2022
Range: 12
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.7625346
Coefficient of variation (CV): 0.0018669293
Kurtosis: -0.81642935
Mean: 2015.36
Median Absolute Deviation (MAD): 3
Skewness: 0.40889114
Sum: 50384
Variance: 14.156667
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=10)]

Common values
Value | Count | Frequency (%)
2015 | 4 | 11.4%
2013 | 4 | 11.4%
2010 | 3 | 8.6%
2022 | 3 | 8.6%
2012 | 3 | 8.6%
2019 | 2 | 5.7%
2016 | 2 | 5.7%
2018 | 2 | 5.7%
2014 | 1 | 2.9%
2020 | 1 | 2.9%
(Missing) | 10 | 28.6%

Minimum 10 values
Value | Count | Frequency (%)
2010 | 3 | 8.6%
2012 | 3 | 8.6%
2013 | 4 | 11.4%
2014 | 1 | 2.9%
2015 | 4 | 11.4%
2016 | 2 | 5.7%
2018 | 2 | 5.7%
2019 | 2 | 5.7%
2020 | 1 | 2.9%
2022 | 3 | 8.6%

Maximum 10 values
Value | Count | Frequency (%)
2022 | 3 | 8.6%
2020 | 1 | 2.9%
2019 | 2 | 5.7%
2018 | 2 | 5.7%
2016 | 2 | 5.7%
2015 | 4 | 11.4%
2014 | 1 | 2.9%
2013 | 4 | 11.4%
2012 | 3 | 8.6%
2010 | 3 | 8.6%

Interactions

[Figures: four interaction plots]

Correlations

[Figure: correlation heatmap]

Variable | 측정망 종류 | 측정소명 | 측정항목 | 측정장비명 | 최초 설치연도 | 교체연도
측정망 종류 | 1.000 | 0.000 | 0.936 | 0.653 | 0.562 | 0.908
측정소명 | 0.000 | 1.000 | 0.000 | 0.000 | 0.886 | 0.898
측정항목 | 0.936 | 0.000 | 1.000 | 0.653 | 0.673 | 0.782
측정장비명 | 0.653 | 0.000 | 0.653 | 1.000 | 0.630 | 0.988
최초 설치연도 | 0.562 | 0.886 | 0.673 | 0.630 | 1.000 | 0.466
교체연도 | 0.908 | 0.898 | 0.782 | 0.988 | 0.466 | 1.000

[Figure: correlation heatmap]

Variable | 측정항목 | 측정망 종류 | 측정장비명
측정항목 | 1.000 | 0.689 | 0.666
측정망 종류 | 0.689 | 1.000 | 0.666
측정장비명 | 0.666 | 0.666 | 1.000

[Figure: correlation heatmap]

Variable | 최초 설치연도 | 교체연도 | 측정망 종류 | 측정항목 | 측정장비명
최초 설치연도 | 1.000 | -0.057 | 0.404 | 0.512 | 0.256
교체연도 | -0.057 | 1.000 | 0.767 | 0.588 | 0.741
측정망 종류 | 0.404 | 0.767 | 1.000 | 0.689 | 0.666
측정항목 | 0.512 | 0.588 | 0.689 | 1.000 | 0.666
측정장비명 | 0.256 | 0.741 | 0.666 | 0.666 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
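
To make the idea of nullity correlation concrete, the sketch below computes a Pearson correlation between the missing-value indicators of the two year columns. It uses only rows 25 to 34 as listed in this report's Sample section, so the result describes that subset and not the heatmap above. The `pearson` helper is written out by hand for illustration and is an assumption about the measure, not the tool's code.

```python
from math import sqrt

# Year columns for rows 25-34 as listed in the Sample section;
# None marks a missing value (<NA>). This is only a subset of the data.
install_year = [2020, 2020, 1997, 1999, 2021, None, 2007, None, 2006, None]
replace_year = [None, None, 2018, 2018, None, 2012, 2013, 2013, 2012, 2012]

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# 1 where the value is missing, 0 where it is present.
install_null = [int(v is None) for v in install_year]
replace_null = [int(v is None) for v in replace_year]

r = pearson(install_null, replace_null)
print(f"nullity correlation on this subset: {r:.3f}")
```

A negative value means that, within these rows, a missing installation year tends to coincide with a present replacement year and vice versa.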

Sample

First rows
Index | 측정망 종류 | 측정소명 | 측정항목 | 측정장비명 | 최초 설치연도 | 교체연도
0 | 도시대기 | 광복동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 1985 | 2010
1 | 도시대기 | 장림동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 1979 | 2015
2 | 도시대기 | 학장동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 1979 | 2015
3 | 도시대기 | 덕천동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 1988 | 2019
4 | 도시대기 | 연산동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 1996 | 2010
5 | 도시대기 | 대연동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 1983 | 2014
6 | 도시대기 | 청룡동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 1997 | 2020
7 | 도시대기 | 전포동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | Thermo iQ seriese | 1980 | 2022
8 | 도시대기 | 태종대 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 1996 | 2019
9 | 도시대기 | 기장읍 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 1999 | 2016

Last rows
Index | 측정망 종류 | 측정소명 | 측정항목 | 측정장비명 | 최초 설치연도 | 교체연도
25 | 도시대기 | 회동동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 2020 | <NA>
26 | 도시대기 | 명지동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 2020 | <NA>
27 | 도로변대기 | 온천동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 1997 | 2018
28 | 도로변대기 | 초량동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 1999 | 2018
29 | 도로변대기 | 삼락동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 2021 | <NA>
30 | 중금속 | 학장동 | Pb, Cd, Cr, Cu, Mn, Fe, Ni,Be,As | MicroPNS HVS 16 | <NA> | 2012
31 | 중금속 | 광안동 | Pb, Cd, Cr, Cu, Mn, Fe, Ni,Be,As | H/V air sampler (SIBATA) | 2007 | 2013
32 | 중금속 | 덕천동 | Pb, Cd, Cr, Cu, Mn, Fe, Ni,Be,As | H/V air sampler (SIBATA) | <NA> | 2013
33 | 중금속 | 연산동 | Pb, Cd, Cr, Cu, Mn, Fe, Ni,Be,As | MicroPNS HVS 16 | 2006 | 2012
34 | 중금속 | 부곡동 | Pb, Cd, Cr, Cu, Mn, Fe, Ni,Be,As | MicroPNS HVS 16 | <NA> | 2012