Overview

Dataset statistics

Number of variables: 6
Number of observations: 36
Missing cells: 13
Missing cells (%): 6.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.9 KiB
Average record size in memory: 53.7 B
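
The missing-cell percentage follows from the counts above: 6 variables times 36 observations gives 216 cells, of which 13 are missing. The check below is only a sketch of that arithmetic. The per-column counts (3 and 10) are taken from the Alerts section of this report, and the variable names are illustrative rather than part of any library.

```python
# Counts copied from the report; names are illustrative, not ydata-profiling API.
n_variables = 6
n_observations = 36
missing_per_column = {"최초 설치연도": 3, "교체연도": 10}  # from the Alerts section

total_cells = n_variables * n_observations
missing_cells = sum(missing_per_column.values())
missing_cells_pct = round(100 * missing_cells / total_cells, 1)

# Per-column percentages are relative to the number of observations.
missing_pct_by_column = {
    name: round(100 * count / n_observations, 1)
    for name, count in missing_per_column.items()
}

print(total_cells, missing_cells, missing_cells_pct)
print(missing_pct_by_column)
```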

Variable types

Categorical: 3
Text: 1
Numeric: 2

Dataset

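Description: Data on the status of the air pollution monitoring network by district in Busan Metropolitan City. It provides the monitoring network type (측정망 종류), station name (측정소명), measured items (측정항목), measuring equipment name (측정장비명), year first installed (최초 설치연도), and replacement year (교체연도).
Author: Busan Metropolitan City (부산광역시)
URL: https://www.data.go.kr/data/3076556/fileData.do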

Alerts

최초 설치연도 is highly overall correlated with 측정망 종류 (High correlation)
교체연도 is highly overall correlated with 측정망 종류 and 2 other fields (High correlation)
측정망 종류 is highly overall correlated with 최초 설치연도 and 3 other fields (High correlation)
측정항목 is highly overall correlated with 교체연도 and 2 other fields (High correlation)
측정장비명 is highly overall correlated with 교체연도 and 2 other fields (High correlation)
최초 설치연도 has 3 (8.3%) missing values (Missing)
교체연도 has 10 (27.8%) missing values (Missing)

Reproduction

Analysis started: 2024-04-29 23:23:37.338769
Analysis finished: 2024-04-29 23:23:39.754541
Duration: 2.42 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
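
A report like this one can be regenerated along the following lines. This is a hedged sketch, not the command that produced this file: the CSV file name and the `cp949` encoding are assumptions (the file would be downloaded from the URL in the Dataset section), and `build_report` is a hypothetical helper. It needs pandas and ydata-profiling installed; the imports sit inside the function so that defining it has no side effects.

```python
def build_report(csv_path="busan_air_monitoring.csv", out_path="report.html",
                 encoding="cp949"):
    """Profile a CSV file and write an HTML report (hypothetical helper).

    csv_path and encoding are assumptions; adjust them to the downloaded file.
    """
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Busan air pollution monitoring network")
    report.to_file(out_path)
    return out_path
```

Calling `build_report()` with the real path writes the HTML report. The timestamps in the regenerated Reproduction section will differ from the ones shown here.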

Variables

측정망 종류
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 8.3%
Missing: 0
Missing (%): 0.0%
Memory size: 420.0 B

Length

Max length: 5
Median length: 4
Mean length: 3.9722222
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 도시대기
2nd row: 도시대기
3rd row: 도시대기
4th row: 도시대기
5th row: 도시대기

Common Values

Value | Count | Frequency (%)
도시대기 (urban air) | 27 | 75.0%
중금속 (heavy metals) | 5 | 13.9%
도로변대기 (roadside air) | 4 | 11.1%
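
The frequency percentages and the length statistics for this variable can be recomputed from the three value counts alone. The snippet below is a small sketch of that calculation using only the counts listed in Common Values; the variable names are illustrative.

```python
# Value counts copied from the Common Values table above.
counts = {"도시대기": 27, "중금속": 5, "도로변대기": 4}

n = sum(counts.values())
frequency_pct = {value: round(100 * c / n, 1) for value, c in counts.items()}

# Length statistics are weighted by how often each value occurs.
lengths = {value: len(value) for value in counts}
mean_length = sum(len(value) * c for value, c in counts.items()) / n

print(frequency_pct)
print(min(lengths.values()), max(lengths.values()), round(mean_length, 7))
```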

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

측정소명
Text

Distinct: 33
Distinct (%): 91.7%
Missing: 0
Missing (%): 0.0%
Memory size: 420.0 B

Length

Max length: 4
Median length: 3
Mean length: 3
Min length: 2

Characters and Unicode

Total characters: 108
Distinct characters: 46
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 30
Unique (%): 83.3%

Sample

1st row: 광복동
2nd row: 장림동
3rd row: 학장동
4th row: 덕천동
5th row: 연산동

Value | Count | Frequency (%)
덕천동 | 2 | 5.6%
광안동 | 2 | 5.6%
학장동 | 2 | 5.6%
연산동 | 2 | 5.6%
부곡동 | 2 | 5.6%
회동동 | 1 | 2.8%
초량동 | 1 | 2.8%
온천동 | 1 | 2.8%
명지동 | 1 | 2.8%
대신동 | 1 | 2.8%
Other values (21) | 21 | 58.3%

Most occurring characters

ValueCountFrequency (%)
34
31.5%
5
 
4.6%
4
 
3.7%
3
 
2.8%
3
 
2.8%
3
 
2.8%
3
 
2.8%
3
 
2.8%
3
 
2.8%
3
 
2.8%
Other values (36) 44
40.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 106 | 98.1%
Space Separator | 2 | 1.9%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
34
32.1%
5
 
4.7%
4
 
3.8%
3
 
2.8%
3
 
2.8%
3
 
2.8%
3
 
2.8%
3
 
2.8%
3
 
2.8%
3
 
2.8%
Other values (35) 42
39.6%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 106 | 98.1%
Common | 2 | 1.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
34
32.1%
5
 
4.7%
4
 
3.8%
3
 
2.8%
3
 
2.8%
3
 
2.8%
3
 
2.8%
3
 
2.8%
3
 
2.8%
3
 
2.8%
Other values (35) 42
39.6%
Common
ValueCountFrequency (%)
2
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 106 | 98.1%
ASCII | 2 | 1.9%

Most frequent character per block

Hangul
ValueCountFrequency (%)
34
32.1%
5
 
4.7%
4
 
3.8%
3
 
2.8%
3
 
2.8%
3
 
2.8%
3
 
2.8%
3
 
2.8%
3
 
2.8%
3
 
2.8%
Other values (35) 42
39.6%
ASCII
ValueCountFrequency (%)
2
100.0%

측정항목
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 11.1%
Missing: 0
Missing (%): 0.0%
Memory size: 420.0 B

Length

Max length: 47
Median length: 47
Mean length: 44.166667
Min length: 32

Unique

Unique: 1
Unique (%): 2.8%

Sample

1st row: SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도
2nd row: SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도
3rd row: SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도
4th row: SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도
5th row: SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도

Common Values

Value | Count | Frequency (%)
SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | 27 | 75.0%
Pb, Cd, Cr, Cu, Mn, Fe, Ni,Be,As | 5 | 13.9%
NOx, O3, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | 3 | 8.3%
SO2, NOx, O3, CO, PM-10, PM-2.6, 풍향, 풍속, 기온, 습도 | 1 | 2.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
습도 | 31 | 9.1%
기온 | 31 | 9.1%
o3 | 31 | 9.1%
nox | 31 | 9.1%
pm-10 | 31 | 9.1%
풍향 | 31 | 9.1%
풍속 | 31 | 9.1%
pm-2.5 | 30 | 8.8%
so2 | 28 | 8.3%
co | 28 | 8.3%
Other values (8) | 36 | 10.6%
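
The token counts in the table above can be reproduced from the four category values and their row counts. The sketch below assumes a simple tokenization rule (split on whitespace, strip surrounding commas, lowercase). That rule is an assumption, chosen because it matches the table: it yields 18 distinct tokens, 10 listed plus 8 "other", and keeps `Ni,Be,As` as a single token. The helper name `tokenize` is hypothetical.

```python
from collections import Counter

# Category counts copied from the Common Values table for 측정항목.
# 풍향 = wind direction, 풍속 = wind speed, 기온 = temperature, 습도 = humidity.
value_counts = {
    "SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도": 27,
    "Pb, Cd, Cr, Cu, Mn, Fe, Ni,Be,As": 5,
    "NOx, O3, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도": 3,
    "SO2, NOx, O3, CO, PM-10, PM-2.6, 풍향, 풍속, 기온, 습도": 1,
}

def tokenize(text):
    # Assumed rule: split on whitespace, strip surrounding commas, lowercase.
    return [token.strip(",").lower() for token in text.split()]

token_counts = Counter()
for value, rows in value_counts.items():
    for token in tokenize(value):
        token_counts[token] += rows  # each row contributes its tokens once

total = sum(token_counts.values())
for token, count in token_counts.most_common(10):
    print(token, count, f"{100 * count / total:.1f}%")
```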

측정장비명
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 11.1%
Missing: 0
Missing (%): 0.0%
Memory size: 420.0 B

Length

Max length: 24
Median length: 17
Mean length: 17.222222
Min length: 15

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: HORIBA 370 serise
2nd row: HORIBA 370 serise
3rd row: HORIBA 370 serise
4th row: HORIBA 370 serise
5th row: HORIBA 370 serise

Common Values

Value | Count | Frequency (%)
HORIBA 370 serise | 28 | 77.8%
Thermo iQ seriese | 3 | 8.3%
MicroPNS HVS 16 | 3 | 8.3%
H/V air sampler (SIBATA) | 2 | 5.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
horiba | 28 | 25.5%
370 | 28 | 25.5%
serise | 28 | 25.5%
thermo | 3 | 2.7%
iq | 3 | 2.7%
seriese | 3 | 2.7%
micropns | 3 | 2.7%
hvs | 3 | 2.7%
16 | 3 | 2.7%
h/v | 2 | 1.8%
Other values (3) | 6 | 5.5%

최초 설치연도
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 23
Distinct (%): 69.7%
Missing: 3
Missing (%): 8.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 2003.7879
Minimum: 1979
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 456.0 B

Quantile statistics

Minimum: 1979
5-th percentile: 1979.6
Q1: 1997
Median: 2003
Q3: 2019
95-th percentile: 2020.4
Maximum: 2023
Range: 44
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 13.490176
Coefficient of variation (CV): 0.0067323374
Kurtosis: -0.86673932
Mean: 2003.7879
Median Absolute Deviation (MAD): 9
Skewness: -0.31565206
Sum: 66125
Variance: 181.98485
Monotonicity: Not monotonic
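
Several of the figures reported for 최초 설치연도 are derived from one another, which makes them easy to cross-check. The sketch below recomputes the mean, coefficient of variation, variance, range, IQR, and distinct percentage from the other reported numbers. It assumes, consistently with the output, that the mean and the distinct percentage are taken over the 33 non-missing values; the variable names are illustrative.

```python
# Figures copied from the report for 최초 설치연도 (year first installed).
n_observations = 36
n_missing = 3
n_distinct = 23
total = 66125            # "Sum"
std = 13.490176          # "Standard deviation"
minimum, maximum = 1979, 2023
q1, q3 = 1997, 2019

n_valid = n_observations - n_missing       # non-missing values
mean = total / n_valid
cv = std / mean                            # coefficient of variation
variance = std ** 2
value_range = maximum - minimum
iqr = q3 - q1
distinct_pct = 100 * n_distinct / n_valid  # relative to non-missing values

print(round(mean, 4), cv, variance)
print(value_range, iqr, round(distinct_pct, 1))
```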
[Figure: histogram with fixed size bins (bins=23)]

Value | Count | Frequency (%)
2019 | 5 | 13.9%
1999 | 3 | 8.3%
1996 | 2 | 5.6%
1997 | 2 | 5.6%
2020 | 2 | 5.6%
1979 | 2 | 5.6%
2011 | 1 | 2.8%
2006 | 1 | 2.8%
2007 | 1 | 2.8%
2023 | 1 | 2.8%
Other values (13) | 13 | 36.1%
(Missing) | 3 | 8.3%

Minimum 10 values

Value | Count | Frequency (%)
1979 | 2 | 5.6%
1980 | 1 | 2.8%
1983 | 1 | 2.8%
1985 | 1 | 2.8%
1988 | 1 | 2.8%
1996 | 2 | 5.6%
1997 | 2 | 5.6%
1999 | 3 | 8.3%
2000 | 1 | 2.8%
2001 | 1 | 2.8%

Maximum 10 values

Value | Count | Frequency (%)
2023 | 1 | 2.8%
2021 | 1 | 2.8%
2020 | 2 | 5.6%
2019 | 5 | 13.9%
2018 | 1 | 2.8%
2012 | 1 | 2.8%
2011 | 1 | 2.8%
2007 | 1 | 2.8%
2006 | 1 | 2.8%
2005 | 1 | 2.8%

교체연도
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 11
Distinct (%): 42.3%
Missing: 10
Missing (%): 27.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 2015.6538
Minimum: 2010
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 456.0 B

Quantile statistics

Minimum: 2010
5-th percentile: 2010
Q1: 2013
Median: 2015
Q3: 2018.75
95-th percentile: 2022
Maximum: 2023
Range: 13
Interquartile range (IQR): 5.75

Descriptive statistics

Standard deviation: 3.9793699
Coefficient of variation (CV): 0.0019742328
Kurtosis: -0.91103063
Mean: 2015.6538
Median Absolute Deviation (MAD): 3
Skewness: 0.39691497
Sum: 52407
Variance: 15.835385
Monotonicity: Not monotonic
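
For 교체연도 the complete set of 26 non-missing values can be rebuilt from the frequency tables for this variable, so the summary statistics can be recomputed directly. The sketch below uses Python's standard `statistics` module. Using the sample variance (n - 1 denominator) and linearly interpolated quartiles is an assumption about how the report computes them, although both agree with the reported figures.

```python
import statistics

# Value counts for 교체연도 (replacement year), copied from the frequency
# tables for this variable; the 10 missing entries are excluded.
year_counts = {
    2010: 3, 2012: 3, 2013: 4, 2014: 1, 2015: 4, 2016: 2,
    2018: 2, 2019: 2, 2020: 1, 2022: 3, 2023: 1,
}
years = sorted(y for y, c in year_counts.items() for _ in range(c))

mean = statistics.mean(years)
variance = statistics.variance(years)   # sample variance (n - 1 denominator)
std = statistics.stdev(years)
# "inclusive" interpolates linearly between order statistics.
q1, median, q3 = statistics.quantiles(years, n=4, method="inclusive")

print(len(years), sum(years), round(mean, 4))
print(std, variance)
print(q1, median, q3, q3 - q1)
```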
[Figure: histogram with fixed size bins (bins=11)]

Value | Count | Frequency (%)
2015 | 4 | 11.1%
2013 | 4 | 11.1%
2010 | 3 | 8.3%
2022 | 3 | 8.3%
2012 | 3 | 8.3%
2019 | 2 | 5.6%
2016 | 2 | 5.6%
2018 | 2 | 5.6%
2014 | 1 | 2.8%
2020 | 1 | 2.8%
(Missing) | 10 | 27.8%

Minimum 10 values

Value | Count | Frequency (%)
2010 | 3 | 8.3%
2012 | 3 | 8.3%
2013 | 4 | 11.1%
2014 | 1 | 2.8%
2015 | 4 | 11.1%
2016 | 2 | 5.6%
2018 | 2 | 5.6%
2019 | 2 | 5.6%
2020 | 1 | 2.8%
2022 | 3 | 8.3%

Maximum 10 values

Value | Count | Frequency (%)
2023 | 1 | 2.8%
2022 | 3 | 8.3%
2020 | 1 | 2.8%
2019 | 2 | 5.6%
2018 | 2 | 5.6%
2016 | 2 | 5.6%
2015 | 4 | 11.1%
2014 | 1 | 2.8%
2013 | 4 | 11.1%
2012 | 3 | 8.3%

Interactions


Correlations

Field | 측정망 종류 | 측정소명 | 측정항목 | 측정장비명 | 최초 설치연도 | 교체연도
측정망 종류 | 1.000 | 0.000 | 0.721 | 0.655 | 0.753 | 0.738
측정소명 | 0.000 | 1.000 | 0.000 | 0.000 | 0.951 | 0.909
측정항목 | 0.721 | 0.000 | 1.000 | 0.866 | 0.553 | 0.782
측정장비명 | 0.655 | 0.000 | 0.866 | 1.000 | 0.653 | 0.854
최초 설치연도 | 0.753 | 0.951 | 0.553 | 0.653 | 1.000 | 0.167
교체연도 | 0.738 | 0.909 | 0.782 | 0.854 | 0.167 | 1.000

Field | 측정항목 | 측정장비명 | 측정망 종류
측정항목 | 1.000 | 0.526 | 0.751
측정장비명 | 0.526 | 1.000 | 0.669
측정망 종류 | 0.751 | 0.669 | 1.000

Field | 최초 설치연도 | 교체연도 | 측정망 종류 | 측정항목 | 측정장비명
최초 설치연도 | 1.000 | 0.076 | 0.557 | 0.342 | 0.373
교체연도 | 0.076 | 1.000 | 0.587 | 0.634 | 0.705
측정망 종류 | 0.557 | 0.587 | 1.000 | 0.751 | 0.669
측정항목 | 0.342 | 0.634 | 0.751 | 1.000 | 0.526
측정장비명 | 0.373 | 0.705 | 0.669 | 0.526 | 1.000
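
The High correlation alerts near the top of the report can be related to the last matrix above. As a sketch, assuming a cut-off of 0.5 (the report does not state the threshold; 0.5 is simply a value that reproduces the alerts), the helper below lists, for each field, the other fields whose coefficient exceeds the cut-off. `high_correlations` is a hypothetical name, not a ydata-profiling function.

```python
# Last correlation matrix from the report (rows and columns in this order).
fields = ["최초 설치연도", "교체연도", "측정망 종류", "측정항목", "측정장비명"]
matrix = [
    [1.000, 0.076, 0.557, 0.342, 0.373],
    [0.076, 1.000, 0.587, 0.634, 0.705],
    [0.557, 0.587, 1.000, 0.751, 0.669],
    [0.342, 0.634, 0.751, 1.000, 0.526],
    [0.373, 0.705, 0.669, 0.526, 1.000],
]
THRESHOLD = 0.5  # assumed cut-off, not stated in the report

def high_correlations(fields, matrix, threshold):
    """Map each field to the other fields whose coefficient exceeds threshold."""
    return {
        name: [other for j, other in enumerate(fields)
               if j != i and matrix[i][j] > threshold]
        for i, name in enumerate(fields)
    }

flagged = high_correlations(fields, matrix, THRESHOLD)
for name, partners in flagged.items():
    print(name, len(partners), partners)
```

Under this assumption 최초 설치연도 is paired only with 측정망 종류, 측정망 종류 with four fields, and each of the remaining three fields with three, which matches the counts in the Alerts section.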

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

Index | 측정망 종류 | 측정소명 | 측정항목 | 측정장비명 | 최초 설치연도 | 교체연도
0 | 도시대기 | 광복동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 1985 | 2010
1 | 도시대기 | 장림동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 1979 | 2015
2 | 도시대기 | 학장동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 1979 | 2015
3 | 도시대기 | 덕천동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 1988 | 2019
4 | 도시대기 | 연산동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 1996 | 2010
5 | 도시대기 | 대연동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 1983 | 2014
6 | 도시대기 | 청룡동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 1997 | 2020
7 | 도시대기 | 전포동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | Thermo iQ seriese | 1980 | 2022
8 | 도시대기 | 태종대 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 1996 | 2019
9 | 도시대기 | 기장읍 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 1999 | 2016

Index | 측정망 종류 | 측정소명 | 측정항목 | 측정장비명 | 최초 설치연도 | 교체연도
26 | 도시대기 | 명지동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 2020 | <NA>
27 | 도로변대기 | 온천동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 1997 | 2018
28 | 도로변대기 | 초량동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 1999 | 2018
29 | 도로변대기 | 삼락동 | SO2, NOx, O3, CO, PM-10, PM-2.5, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 2021 | <NA>
30 | 도로변대기 | 우동 | SO2, NOx, O3, CO, PM-10, PM-2.6, 풍향, 풍속, 기온, 습도 | HORIBA 370 serise | 2023 | <NA>
31 | 중금속 | 학장동 | Pb, Cd, Cr, Cu, Mn, Fe, Ni,Be,As | MicroPNS HVS 16 | <NA> | 2012
32 | 중금속 | 광안동 | Pb, Cd, Cr, Cu, Mn, Fe, Ni,Be,As | H/V air sampler (SIBATA) | 2007 | 2013
33 | 중금속 | 덕천동 | Pb, Cd, Cr, Cu, Mn, Fe, Ni,Be,As | H/V air sampler (SIBATA) | <NA> | 2013
34 | 중금속 | 연산동 | Pb, Cd, Cr, Cu, Mn, Fe, Ni,Be,As | MicroPNS HVS 16 | 2006 | 2012
35 | 중금속 | 부곡동 | Pb, Cd, Cr, Cu, Mn, Fe, Ni,Be,As | MicroPNS HVS 16 | <NA> | 2012