Overview

Dataset statistics

Number of variables: 8
Number of observations: 32
Missing cells: 54
Missing cells (%): 21.1%
Duplicate rows: 1
Duplicate rows (%): 3.1%
Total size in memory: 2.3 KiB
Average record size in memory: 72.1 B
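
The two percentages above can be re-derived from the counts. This is a minimal sketch, assuming missing cells are divided by the total number of cells (variables × observations) and duplicate rows by the number of observations; the variable names are illustrative and not part of the report.

```python
# Counts copied from the dataset statistics above.
n_variables = 8
n_observations = 32
missing_cells = 54
duplicate_rows = 1

# Assumption: percentages are taken over all cells and all rows respectively.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```

The 54 missing cells correspond to the 27 missing values in each of the two Text variables; the six Categorical variables report no missing values.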

Variable types

Text: 2
Categorical: 6

Dataset

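Description: Provides indoor air quality management results, covering items such as fine dust, carbon monoxide, and carbon dioxide, measured through annual regular inspections at facilities operated by the corporation.
Author: 인천광역시계양구시설관리공단
URL: https://www.data.go.kr/data/15042967/fileData.do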

Alerts

Dataset has 1 (3.1%) duplicate rows (Duplicates)
포름알데히드(㎍/㎥) is highly overall correlated with 미세먼지(㎍/㎥) and 4 other fields (High correlation)
미세먼지(㎍/㎥) is highly overall correlated with 이산화탄소(PPM) and 4 other fields (High correlation)
이산화탄소(PPM) is highly overall correlated with 미세먼지(㎍/㎥) and 4 other fields (High correlation)
데이터기준일자 is highly overall correlated with 미세먼지(㎍/㎥) and 4 other fields (High correlation)
측정결과 is highly overall correlated with 미세먼지(㎍/㎥) and 4 other fields (High correlation)
일산화탄소(PPM) is highly overall correlated with 미세먼지(㎍/㎥) and 4 other fields (High correlation)
미세먼지(㎍/㎥) is highly imbalanced (61.8%) (Imbalance)
이산화탄소(PPM) is highly imbalanced (61.8%) (Imbalance)
포름알데히드(㎍/㎥) is highly imbalanced (61.8%) (Imbalance)
일산화탄소(PPM) is highly imbalanced (61.8%) (Imbalance)
시설명 has 27 (84.4%) missing values (Missing)
도로명주소 has 27 (84.4%) missing values (Missing)
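
The 61.8% imbalance figure can be reproduced from the value counts of the four measurement columns (one value seen 27 times, five values seen once each). The sketch below assumes the score is one minus the Shannon entropy divided by its maximum, log2 of the number of classes. Whether the report computes it exactly this way is an assumption, and `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (base 2)
    of the class counts and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# One placeholder value seen 27 times plus five measurements seen once each.
score = imbalance_score([27, 1, 1, 1, 1, 1])
print(f"{score:.1%}")
```

A score of 0 would mean all classes are equally frequent; values near 1 mean a single class dominates, as the `<NA>` placeholder does here.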

Reproduction

Analysis started: 2023-12-12 16:11:41.070967
Analysis finished: 2023-12-12 16:11:42.165901
Duration: 1.09 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시설명
Text

MISSING 

Distinct: 5
Distinct (%): 100.0%
Missing: 27
Missing (%): 84.4%
Memory size: 388.0 B

Length

Max length: 15
Median length: 11
Mean length: 11.4
Min length: 8
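
These length statistics can be checked against the five non-missing 시설명 values listed in the Sample section of this variable. A minimal sketch, assuming length means the number of Unicode code points (what Python's `len` returns for a `str`):

```python
import statistics

# The five non-missing 시설명 values, copied from this report's Sample section.
names = [
    "계양구청사 사무실",
    "계양구청사 주차장(아-5)",
    "계양구청사 주차장(사-11)",
    "계산체육공원지하주차장",
    "계양산공영주차장",
]

# Assumption: "length" is the code-point count of each value.
lengths = [len(name) for name in names]
total_characters = sum(lengths)

print("Max length:", max(lengths))
print("Median length:", statistics.median(lengths))
print("Mean length:", statistics.mean(lengths))
print("Min length:", min(lengths))
print("Total characters:", total_characters)
```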

Characters and Unicode

Total characters: 57
Distinct characters: 25
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): 100.0%

Sample

1st row: 계양구청사 사무실
2nd row: 계양구청사 주차장(아-5)
3rd row: 계양구청사 주차장(사-11)
4th row: 계산체육공원지하주차장
5th row: 계양산공영주차장

Value | Count | Frequency (%)
계양구청사 | 3 | 37.5%
사무실 | 1 | 12.5%
주차장(아-5 | 1 | 12.5%
주차장(사-11 | 1 | 12.5%
계산체육공원지하주차장 | 1 | 12.5%
계양산공영주차장 | 1 | 12.5%

Most occurring characters

Value | Count | Frequency (%)
 | 5 | 8.8%
 | 5 | 8.8%
 | 4 | 7.0%
 | 4 | 7.0%
 | 4 | 7.0%
 | 4 | 7.0%
 | 3 | 5.3%
 | 3 | 5.3%
 | 3 | 5.3%
( | 2 | 3.5%
Other values (15) | 20 | 35.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 45 | 78.9%
Space Separator | 3 | 5.3%
Decimal Number | 3 | 5.3%
Open Punctuation | 2 | 3.5%
Dash Punctuation | 2 | 3.5%
Close Punctuation | 2 | 3.5%
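
The category counts above can be reproduced with Python's standard `unicodedata` module, which returns the two-letter general category of each code point (`Lo` = Other Letter, `Zs` = Space Separator, `Nd` = Decimal Number, `Ps`/`Pd`/`Pe` = Open/Dash/Close Punctuation). This is a sketch over the five sample values, not the report's own implementation.

```python
import unicodedata
from collections import Counter

# The five non-missing 시설명 values, copied from this report's Sample section.
names = [
    "계양구청사 사무실",
    "계양구청사 주차장(아-5)",
    "계양구청사 주차장(사-11)",
    "계산체육공원지하주차장",
    "계양산공영주차장",
]

# Count the Unicode general category of every character in every value.
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)

for category, count in category_counts.most_common():
    print(category, count)
```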

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 5 | 11.1%
 | 5 | 11.1%
 | 4 | 8.9%
 | 4 | 8.9%
 | 4 | 8.9%
 | 4 | 8.9%
 | 3 | 6.7%
 | 3 | 6.7%
 | 2 | 4.4%
 | 2 | 4.4%
Other values (9) | 9 | 20.0%
Decimal Number
Value | Count | Frequency (%)
1 | 2 | 66.7%
5 | 1 | 33.3%
Space Separator
Value | Count | Frequency (%)
 | 3 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 2 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 45 | 78.9%
Common | 12 | 21.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 5 | 11.1%
 | 5 | 11.1%
 | 4 | 8.9%
 | 4 | 8.9%
 | 4 | 8.9%
 | 4 | 8.9%
 | 3 | 6.7%
 | 3 | 6.7%
 | 2 | 4.4%
 | 2 | 4.4%
Other values (9) | 9 | 20.0%
Common
Value | Count | Frequency (%)
 | 3 | 25.0%
( | 2 | 16.7%
- | 2 | 16.7%
) | 2 | 16.7%
1 | 2 | 16.7%
5 | 1 | 8.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 45 | 78.9%
ASCII | 12 | 21.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 5 | 11.1%
 | 5 | 11.1%
 | 4 | 8.9%
 | 4 | 8.9%
 | 4 | 8.9%
 | 4 | 8.9%
 | 3 | 6.7%
 | 3 | 6.7%
 | 2 | 4.4%
 | 2 | 4.4%
Other values (9) | 9 | 20.0%
ASCII
Value | Count | Frequency (%)
 | 3 | 25.0%
( | 2 | 16.7%
- | 2 | 16.7%
) | 2 | 16.7%
1 | 2 | 16.7%
5 | 1 | 8.3%

도로명주소
Text

MISSING 

Distinct: 3
Distinct (%): 60.0%
Missing: 27
Missing (%): 84.4%
Memory size: 388.0 B

Length

Max length: 20
Median length: 15
Mean length: 16.2
Min length: 15

Characters and Unicode

Total characters: 81
Distinct characters: 22
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 40.0%
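
Distinct (3) and Unique (2) differ for this variable because one address repeats. A minimal sketch of the distinction, assuming "distinct" counts values that occur at least once and "unique" counts values that occur exactly once, with percentages taken over the five non-missing rows:

```python
from collections import Counter

# The five non-missing 도로명주소 values, copied from this report's Sample section.
addresses = [
    "인천시 계양구 계산새로 88",
    "인천시 계양구 계산새로 88",
    "인천시 계양구 계산새로 88",
    "인천시 계양구 주부토로 570",
    "인천시 계양구 계양산로 102번길 4",
]

counts = Counter(addresses)
distinct = len(counts)                              # values seen at least once
unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(f"Distinct: {distinct} ({100 * distinct / len(addresses):.1f}%)")
print(f"Unique: {unique} ({100 * unique / len(addresses):.1f}%)")
```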

Sample

1st row: 인천시 계양구 계산새로 88
2nd row: 인천시 계양구 계산새로 88
3rd row: 인천시 계양구 계산새로 88
4th row: 인천시 계양구 주부토로 570
5th row: 인천시 계양구 계양산로 102번길 4

Value | Count | Frequency (%)
인천시 | 5 | 23.8%
계양구 | 5 | 23.8%
계산새로 | 3 | 14.3%
88 | 3 | 14.3%
주부토로 | 1 | 4.8%
570 | 1 | 4.8%
계양산로 | 1 | 4.8%
102번길 | 1 | 4.8%
4 | 1 | 4.8%

Most occurring characters

Value | Count | Frequency (%)
 | 16 | 19.8%
 | 9 | 11.1%
 | 6 | 7.4%
8 | 6 | 7.4%
 | 5 | 6.2%
 | 5 | 6.2%
 | 5 | 6.2%
 | 5 | 6.2%
 | 5 | 6.2%
 | 4 | 4.9%
Other values (12) | 15 | 18.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 52 | 64.2%
Space Separator | 16 | 19.8%
Decimal Number | 13 | 16.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 9 | 17.3%
 | 6 | 11.5%
 | 5 | 9.6%
 | 5 | 9.6%
 | 5 | 9.6%
 | 5 | 9.6%
 | 5 | 9.6%
 | 4 | 7.7%
 | 3 | 5.8%
 | 1 | 1.9%
Other values (4) | 4 | 7.7%
Decimal Number
Value | Count | Frequency (%)
8 | 6 | 46.2%
0 | 2 | 15.4%
2 | 1 | 7.7%
1 | 1 | 7.7%
7 | 1 | 7.7%
5 | 1 | 7.7%
4 | 1 | 7.7%
Space Separator
Value | Count | Frequency (%)
 | 16 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 52 | 64.2%
Common | 29 | 35.8%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 9 | 17.3%
 | 6 | 11.5%
 | 5 | 9.6%
 | 5 | 9.6%
 | 5 | 9.6%
 | 5 | 9.6%
 | 5 | 9.6%
 | 4 | 7.7%
 | 3 | 5.8%
 | 1 | 1.9%
Other values (4) | 4 | 7.7%
Common
Value | Count | Frequency (%)
 | 16 | 55.2%
8 | 6 | 20.7%
0 | 2 | 6.9%
2 | 1 | 3.4%
1 | 1 | 3.4%
7 | 1 | 3.4%
5 | 1 | 3.4%
4 | 1 | 3.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 52 | 64.2%
ASCII | 29 | 35.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
 | 16 | 55.2%
8 | 6 | 20.7%
0 | 2 | 6.9%
2 | 1 | 3.4%
1 | 1 | 3.4%
7 | 1 | 3.4%
5 | 1 | 3.4%
4 | 1 | 3.4%
Hangul
Value | Count | Frequency (%)
 | 9 | 17.3%
 | 6 | 11.5%
 | 5 | 9.6%
 | 5 | 9.6%
 | 5 | 9.6%
 | 5 | 9.6%
 | 5 | 9.6%
 | 4 | 7.7%
 | 3 | 5.8%
 | 1 | 1.9%
Other values (4) | 4 | 7.7%

미세먼지(㎍/㎥)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 18.8%
Missing: 0
Missing (%): 0.0%
Memory size: 388.0 B

Value | Count
<NA> | 27
10.9 | 1
23.1 | 1
28.7 | 1
40.35 | 1

Length

Max length: 5
Median length: 4
Mean length: 4.03125
Min length: 4

Unique

Unique: 5
Unique (%): 15.6%

Sample

1st row: 10.9
2nd row: 23.1
3rd row: 28.7
4th row: 40.35
5th row: 54.0

Common Values

Value | Count | Frequency (%)
<NA> | 27 | 84.4%
10.9 | 1 | 3.1%
23.1 | 1 | 3.1%
28.7 | 1 | 3.1%
40.35 | 1 | 3.1%
54.0 | 1 | 3.1%
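
The mean length of 4.03125 reported for this variable is consistent with the 27 empty rows being stored as the literal four-character text `<NA>`, which would also explain why the variable is typed as Categorical and reports 0 missing values. That reading is an assumption, not something the report states. A minimal check:

```python
# Assumption: the 27 empty rows hold the literal string "<NA>" rather than a real null.
values = ["10.9", "23.1", "28.7", "40.35", "54.0"] + ["<NA>"] * 27

lengths = [len(v) for v in values]
mean_length = sum(lengths) / len(lengths)

print("Max length:", max(lengths))
print("Mean length:", mean_length)
print("Min length:", min(lengths))
```

If the placeholders were converted to real nulls before profiling, the column could be parsed as numeric and these string-length statistics would no longer apply.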

Length

Histogram of lengths of the category

이산화탄소(PPM)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 18.8%
Missing: 0
Missing (%): 0.0%
Memory size: 388.0 B

Value | Count
<NA> | 27
708.0 | 1
513.0 | 1
474.0 | 1
525.0 | 1

Length

Max length: 5
Median length: 4
Mean length: 4.15625
Min length: 4

Unique

Unique: 5
Unique (%): 15.6%

Sample

1st row: 708.0
2nd row: 513.0
3rd row: 474.0
4th row: 525.0
5th row: 414.5

Common Values

Value | Count | Frequency (%)
<NA> | 27 | 84.4%
708.0 | 1 | 3.1%
513.0 | 1 | 3.1%
474.0 | 1 | 3.1%
525.0 | 1 | 3.1%
414.5 | 1 | 3.1%

Length

Histogram of lengths of the category

포름알데히드(㎍/㎥)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 18.8%
Missing: 0
Missing (%): 0.0%
Memory size: 388.0 B

Value | Count
<NA> | 27
7.0 | 1
9.1 | 1
8.1 | 1
9.85 | 1

Length

Max length: 4
Median length: 4
Mean length: 3.875
Min length: 3

Unique

Unique: 5
Unique (%): 15.6%

Sample

1st row: 7.0
2nd row: 9.1
3rd row: 8.1
4th row: 9.85
5th row: 5.6

Common Values

Value | Count | Frequency (%)
<NA> | 27 | 84.4%
7.0 | 1 | 3.1%
9.1 | 1 | 3.1%
8.1 | 1 | 3.1%
9.85 | 1 | 3.1%
5.6 | 1 | 3.1%

Length

Histogram of lengths of the category

일산화탄소(PPM)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 18.8%
Missing: 0
Missing (%): 0.0%
Memory size: 388.0 B

Value | Count
<NA> | 27
0.4 | 1
1.5 | 1
1.0 | 1
2.3 | 1

Length

Max length: 4
Median length: 4
Mean length: 3.875
Min length: 3

Unique

Unique: 5
Unique (%): 15.6%

Sample

1st row: 0.4
2nd row: 1.5
3rd row: 1.0
4th row: 2.3
5th row: 1.55

Common Values

Value | Count | Frequency (%)
<NA> | 27 | 84.4%
0.4 | 1 | 3.1%
1.5 | 1 | 3.1%
1.0 | 1 | 3.1%
2.3 | 1 | 3.1%
1.55 | 1 | 3.1%

Length

Histogram of lengths of the category

측정결과
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 6.2%
Missing: 0
Missing (%): 0.0%
Memory size: 388.0 B

Value | Count
<NA> | 27
적합 | 5

Length

Max length: 4
Median length: 4
Mean length: 3.6875
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 적합
2nd row: 적합
3rd row: 적합
4th row: 적합
5th row: 적합

Common Values

Value | Count | Frequency (%)
<NA> | 27 | 84.4%
적합 | 5 | 15.6%

Length

Histogram of lengths of the category

데이터기준일자
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 6.2%
Missing: 0
Missing (%): 0.0%
Memory size: 388.0 B

Value | Count
<NA> | 27
2021-03-01 | 5

Length

Max length: 10
Median length: 4
Mean length: 4.9375
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021-03-01
2nd row: 2021-03-01
3rd row: 2021-03-01
4th row: 2021-03-01
5th row: 2021-03-01

Common Values

Value | Count | Frequency (%)
<NA> | 27 | 84.4%
2021-03-01 | 5 | 15.6%

Length

Histogram of lengths of the category

Correlations

 | 시설명 | 도로명주소 | 미세먼지(㎍/㎥) | 이산화탄소(PPM) | 포름알데히드(㎍/㎥) | 일산화탄소(PPM)
시설명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
도로명주소 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
미세먼지(㎍/㎥) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
이산화탄소(PPM) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
포름알데히드(㎍/㎥) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
일산화탄소(PPM) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

 | 포름알데히드(㎍/㎥) | 미세먼지(㎍/㎥) | 이산화탄소(PPM) | 데이터기준일자 | 측정결과 | 일산화탄소(PPM)
포름알데히드(㎍/㎥) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
미세먼지(㎍/㎥) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
이산화탄소(PPM) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
데이터기준일자 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
측정결과 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
일산화탄소(PPM) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

 | 미세먼지(㎍/㎥) | 이산화탄소(PPM) | 포름알데히드(㎍/㎥) | 일산화탄소(PPM) | 측정결과 | 데이터기준일자
미세먼지(㎍/㎥) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
이산화탄소(PPM) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
포름알데히드(㎍/㎥) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
일산화탄소(PPM) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
측정결과 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
데이터기준일자 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
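
Coefficients of exactly 1.000 everywhere are what a categorical association measure yields when each column fully determines the others, as happens here: 27 rows share one placeholder and the remaining five rows each carry their own value. The sketch below uses the plain (uncorrected) Cramér's V as an illustration. The report's own estimator is not stated in this section and may differ, for example by applying a bias correction, and `cramers_v` is a hypothetical helper.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Plain (uncorrected) Cramér's V for two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else float("nan")


# Column values as shown in this report; "<NA>" is treated as an ordinary category.
pm = ["10.9", "23.1", "28.7", "40.35", "54.0"] + ["<NA>"] * 27
co2 = ["708.0", "513.0", "474.0", "525.0", "414.5"] + ["<NA>"] * 27
result = ["적합"] * 5 + ["<NA>"] * 27

print(round(cramers_v(pm, co2), 3))
print(round(cramers_v(pm, result), 3))
```

With only five informative rows, these coefficients mostly reflect the shared placeholder pattern rather than any relationship between the measurements themselves.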

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

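First rows

 | 시설명 | 도로명주소 | 미세먼지(㎍/㎥) | 이산화탄소(PPM) | 포름알데히드(㎍/㎥) | 일산화탄소(PPM) | 측정결과 | 데이터기준일자
0 | 계양구청사 사무실 | 인천시 계양구 계산새로 88 | 10.9 | 708.0 | 7.0 | 0.4 | 적합 | 2021-03-01
1 | 계양구청사 주차장(아-5) | 인천시 계양구 계산새로 88 | 23.1 | 513.0 | 9.1 | 1.5 | 적합 | 2021-03-01
2 | 계양구청사 주차장(사-11) | 인천시 계양구 계산새로 88 | 28.7 | 474.0 | 8.1 | 1.0 | 적합 | 2021-03-01
3 | 계산체육공원지하주차장 | 인천시 계양구 주부토로 570 | 40.35 | 525.0 | 9.85 | 2.3 | 적합 | 2021-03-01
4 | 계양산공영주차장 | 인천시 계양구 계양산로 102번길 4 | 54.0 | 414.5 | 5.6 | 1.55 | 적합 | 2021-03-01
5 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
6 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
7 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
8 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
9 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Last rows

 | 시설명 | 도로명주소 | 미세먼지(㎍/㎥) | 이산화탄소(PPM) | 포름알데히드(㎍/㎥) | 일산화탄소(PPM) | 측정결과 | 데이터기준일자
22 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
23 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
24 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
25 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
26 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
27 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
28 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
29 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
30 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
31 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>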

Duplicate rows

Most frequently occurring

 | 시설명 | 도로명주소 | 미세먼지(㎍/㎥) | 이산화탄소(PPM) | 포름알데히드(㎍/㎥) | 일산화탄소(PPM) | 측정결과 | 데이터기준일자 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 27
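
The overview reports 1 duplicate row while this table reports 27 duplicates. Both are consistent if the first figure counts distinct rows that repeat and the second counts occurrences of that row. The sketch below illustrates this reading and shows one way to drop the empty rows before re-profiling. It assumes rows 5 to 31 are all-placeholder rows, as the first and last rows of the sample suggest.

```python
from collections import Counter

NA = "<NA>"

# The five measured rows, copied from the Sample section of this report.
measured = [
    ("계양구청사 사무실", "인천시 계양구 계산새로 88", "10.9", "708.0", "7.0", "0.4", "적합", "2021-03-01"),
    ("계양구청사 주차장(아-5)", "인천시 계양구 계산새로 88", "23.1", "513.0", "9.1", "1.5", "적합", "2021-03-01"),
    ("계양구청사 주차장(사-11)", "인천시 계양구 계산새로 88", "28.7", "474.0", "8.1", "1.0", "적합", "2021-03-01"),
    ("계산체육공원지하주차장", "인천시 계양구 주부토로 570", "40.35", "525.0", "9.85", "2.3", "적합", "2021-03-01"),
    ("계양산공영주차장", "인천시 계양구 계양산로 102번길 4", "54.0", "414.5", "5.6", "1.55", "적합", "2021-03-01"),
]
# Assumption: the remaining 27 rows consist only of placeholders.
rows = measured + [(NA,) * 8] * 27

row_counts = Counter(rows)
duplicated = {row: n for row, n in row_counts.items() if n > 1}

# Keep only rows that carry at least one real value.
cleaned = [row for row in rows if any(value != NA for value in row)]

print("Distinct rows that repeat:", len(duplicated))
print("Occurrences of the repeated row:", duplicated[(NA,) * 8])
print("Rows left after dropping empty rows:", len(cleaned))
```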