Overview

Dataset statistics

Number of variables: 7
Number of observations: 102
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.8 KiB
Average record size in memory: 58.3 B

Variable types

Numeric: 1
Categorical: 6

Dataset

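Description: Results of the conformity evaluation of air quality measuring equipment installed in 대전교통공사 (Daejeon Transportation Corporation) stations (evaluation results for air quality measuring equipment installed in 22 stations)
Author: 대전교통공사 (Daejeon Transportation Corporation)
URL: https://www.data.go.kr/data/15053141/fileData.do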

Alerts

장비명 (equipment name) is highly overall correlated with 모델명 (model name) [High correlation]
정도검사일 (accuracy inspection date) is highly overall correlated with 연도 (year) and 1 other field [High correlation]
검사기관 (inspection agency) is highly overall correlated with 연도 (year) and 1 other field [High correlation]
모델명 (model name) is highly overall correlated with 장비명 (equipment name) [High correlation]
연도 (year) is highly overall correlated with 검사기관 (inspection agency) and 1 other field [High correlation]
검사결과 (inspection result) is highly imbalanced (92.1%) [Imbalance]
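
The 92.1% in the imbalance alert can be reproduced from the class counts of 검사결과 (101 vs. 1). The sketch below assumes the score is one minus the Shannon entropy of the class distribution, normalized by log2 of the number of classes. That definition is an assumption that happens to match the reported value, and `imbalance_score` is a hypothetical helper name, not a function of the profiling tool.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / log2(number of classes)).

    0.0 means perfectly balanced classes; values near 1.0 mean one class
    dominates. Assumed definition, not confirmed by the report itself.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        # With fewer than two classes there is nothing to compare;
        # returning 0.0 is a convention chosen for this sketch.
        return 0.0
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in counts if c > 0
    )
    return 1.0 - entropy / math.log2(n_classes)


# Class counts of 검사결과 taken from the report: 101 and 1.
score = imbalance_score([101, 1])
print(f"{score:.1%}")  # prints 92.1%
```

A perfectly balanced column, for example `imbalance_score([50, 50])`, scores 0 under the same definition.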

Reproduction

Analysis started: 2023-12-12 10:37:44.854593
Analysis finished: 2023-12-12 10:37:45.714229
Duration: 0.86 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연도 (year)
Real number (ℝ)

HIGH CORRELATION

Distinct: 9
Distinct (%): 8.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2018.9706
Minimum: 2015
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 2015
5th percentile: 2015
Q1: 2017
Median: 2019
Q3: 2021
95th percentile: 2023
Maximum: 2023
Range: 8
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.4993301
Coefficient of variation (CV): 0.001237923
Kurtosis: -1.1737529
Mean: 2018.9706
Median Absolute Deviation (MAD): 2
Skewness: -0.059076572
Sum: 205935
Variance: 6.2466511
Monotonicity: Increasing

[Figure: Histogram with fixed size bins (bins=9)]

Value  Count  Frequency (%)
2015   11     10.8%
2016   11     10.8%
2017   11     10.8%
2018   11     10.8%
2019   12     11.8%
2020   13     12.7%
2021   13     12.7%
2022   12     11.8%
2023   8      7.8%
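
The descriptive statistics reported for 연도 can be recomputed from its value counts using only the Python standard library. This is a sketch for checking the numbers, not the profiling tool's own code. It assumes the reported variance is the sample variance (n - 1 denominator), which is what reproduces 6.2466511.

```python
import statistics

# Year -> number of observations, copied from the report's value counts.
year_counts = {
    2015: 11, 2016: 11, 2017: 11, 2018: 11, 2019: 12,
    2020: 13, 2021: 13, 2022: 12, 2023: 8,
}

# Expand the frequency table back into one value per observation.
years = [year for year, n in year_counts.items() for _ in range(n)]

total = sum(years)
mean = statistics.mean(years)
variance = statistics.variance(years)  # sample variance, n - 1 denominator
std_dev = statistics.stdev(years)      # square root of the sample variance
cv = std_dev / mean                    # coefficient of variation
median = statistics.median(years)
# Median absolute deviation: median distance from the median.
mad = statistics.median(abs(y - median) for y in years)

print(len(years), total, round(mean, 4), round(variance, 7))
print(round(std_dev, 7), round(cv, 9), median, mad)
```

The coefficient of variation is tiny because the standard deviation (about 2.5 years) is divided by a mean near 2019, so it says little about the spread of a calendar-year column.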

설치장소 (installation location)
Categorical

Distinct: 6
Distinct (%): 5.9%
Missing: 0
Missing (%): 0.0%
Memory size: 948.0 B
Top categories: 대전역 (35), 시청역 (35), 월평역 (9), 터널내 (8), 월드컵경기장역 (8)

Length

Max length: 7
Median length: 3
Mean length: 3.3137255
Min length: 3
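
The length statistics for 설치장소 are consistent with counting characters (Unicode code points) rather than bytes. The sketch below recomputes them from the category counts listed under Common Values. It assumes the text is NFC-normalized, so each Hangul syllable is a single code point, and normalizes explicitly to be safe.

```python
import statistics
import unicodedata

# Category -> number of rows, copied from the report's Common Values table.
location_counts = {
    "대전역": 35,
    "시청역": 35,
    "월평역": 9,
    "터널내": 8,
    "월드컵경기장역": 8,
    "판암역": 7,
}

# One length per row; NFC keeps each Hangul syllable as one code point.
lengths = [
    len(unicodedata.normalize("NFC", name))
    for name, n in location_counts.items()
    for _ in range(n)
]

max_length = max(lengths)
min_length = min(lengths)
median_length = statistics.median(lengths)
mean_length = statistics.mean(lengths)

print(max_length, median_length, round(mean_length, 7), min_length)
```

Only 월드컵경기장역 has seven characters; its 8 rows are what lift the mean above 3.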

Unique

Unique: 0
Unique (%): 0.0%

Sample

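1st row: 판암역 (Panam Station)
2nd row: 대전역 (Daejeon Station)
3rd row: 대전역
4th row: 대전역
5th row: 대전역

Common Values

Value                                        Count  Frequency (%)
대전역 (Daejeon Station)                     35     34.3%
시청역 (City Hall Station)                   35     34.3%
월평역 (Wolpyeong Station)                   9      8.8%
터널내 (inside tunnel)                       8      7.8%
월드컵경기장역 (World Cup Stadium Station)  8      7.8%
판암역 (Panam Station)                       7      6.9%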

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

장비명 (equipment name)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 3.9%
Missing: 0
Missing (%): 0.0%
Memory size: 948.0 B
Top categories: 미세먼지 (45), 이산화탄소 (21), 일산화탄소 (18), 이산화질소 (18)

Length

Max length: 5
Median length: 5
Mean length: 4.5588235
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

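1st row: 미세먼지 (fine dust)
2nd row: 미세먼지
3rd row: 이산화탄소 (carbon dioxide)
4th row: 일산화탄소 (carbon monoxide)
5th row: 이산화질소 (nitrogen dioxide)

Common Values

Value                          Count  Frequency (%)
미세먼지 (fine dust)           45     44.1%
이산화탄소 (carbon dioxide)    21     20.6%
일산화탄소 (carbon monoxide)   18     17.6%
이산화질소 (nitrogen dioxide)  18     17.6%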

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

모델명 (model name)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 3.9%
Missing: 0
Missing (%): 0.0%
Memory size: 948.0 B
Top categories: FH62_C14 (45), 410i (21), 48i (18), 42i (18)

Length

Max length: 8
Median length: 4
Mean length: 5.4117647
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: FH62_C14
2nd row: FH62_C14
3rd row: 410i
4th row: 48i
5th row: 42i

Common Values

Value     Count  Frequency (%)
FH62_C14  45     44.1%
410i      21     20.6%
48i       18     17.6%
42i       18     17.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

검사기관 (inspection agency)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 948.0 B
Top categories: 한국산업기술시험원 (76), 한국표준과학연구원 (26)

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Unique

Unique: 0
Unique (%): 0.0%

Sample

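1st row: 한국산업기술시험원 (Korea Testing Laboratory)
2nd row: 한국산업기술시험원
3rd row: 한국산업기술시험원
4th row: 한국산업기술시험원
5th row: 한국산업기술시험원

Common Values

Value                                                                    Count  Frequency (%)
한국산업기술시험원 (Korea Testing Laboratory)                            76     74.5%
한국표준과학연구원 (Korea Research Institute of Standards and Science)  26     25.5%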

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

정도검사일 (accuracy inspection date)
Categorical

HIGH CORRELATION

Distinct: 12
Distinct (%): 11.8%
Missing: 0
Missing (%): 0.0%
Memory size: 948.0 B
Top categories: 2020-07-01 (13), 2022-06-25 (12), 2015-06-22 (11), 2016-06-24 (11), 2017-06-26 (11), other values (7 categories, 44 rows)

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 2
Unique (%): 2.0%

Sample

1st row: 2015-06-22
2nd row: 2015-06-22
3rd row: 2015-06-22
4th row: 2015-06-22
5th row: 2015-06-22

Common Values

Value             Count  Frequency (%)
2020-07-01        13     12.7%
2022-06-25        12     11.8%
2015-06-22        11     10.8%
2016-06-24        11     10.8%
2017-06-26        11     10.8%
2018-07-17        11     10.8%
2019-07-11        11     10.8%
2021-06-17        9      8.8%
2023-06-27        7      6.9%
2021-06-23        4      3.9%
Other values (2)  2      2.0%

Length

[Figure: Histogram of lengths of the category]

검사결과 (inspection result)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 948.0 B
Top categories: 적합 (101), 부적합 (1)

Length

Max length: 3
Median length: 2
Mean length: 2.0098039
Min length: 2

Unique

Unique: 1
Unique (%): 1.0%
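
"Distinct" and "Unique" measure different things in this report. The numbers are consistent with Distinct counting the different values present and Unique counting the values that occur exactly once. That reading is inferred from the figures, not stated by the report. A short sketch using the counts of 검사결과:

```python
from collections import Counter

# Value -> number of rows, copied from the report (적합: 101, 부적합: 1).
result_counts = Counter({"적합": 101, "부적합": 1})

n_rows = sum(result_counts.values())

# Distinct: how many different values appear at all.
distinct = len(result_counts)
# Unique: how many values appear exactly once.
unique = sum(1 for count in result_counts.values() if count == 1)

distinct_pct = 100 * distinct / n_rows
unique_pct = 100 * unique / n_rows

print(distinct, f"{distinct_pct:.1f}%")  # prints 2 2.0%
print(unique, f"{unique_pct:.1f}%")      # prints 1 1.0%
```

The single 부적합 row is the only value that occurs once, which is why Unique is 1 while Distinct is 2.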

Sample

1st row: 적합 (conforming)
2nd row: 적합
3rd row: 적합
4th row: 적합
5th row: 적합

Common Values

Value                   Count  Frequency (%)
적합 (conforming)       101    99.0%
부적합 (nonconforming)  1      1.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

            연도   설치장소  장비명  모델명  검사기관  정도검사일  검사결과
연도        1.000  0.000     0.000   0.000   1.000     1.000       0.000
설치장소    0.000  1.000     0.496   0.496   0.000     0.000       0.000
장비명      0.000  0.496     1.000   1.000   0.000     0.000       0.198
모델명      0.000  0.496     1.000   1.000   0.000     0.000       0.198
검사기관    1.000  0.000     0.000   0.000   1.000     1.000       0.000
정도검사일  1.000  0.000     0.000   0.000   1.000     1.000       0.000
검사결과    0.000  0.000     0.198   0.198   0.000     0.000       1.000

[Figure: correlation heatmap]

            설치장소  장비명  정도검사일  검사기관  모델명  검사결과
설치장소    1.000     0.337   0.000       0.000     0.337   0.000
장비명      0.337     1.000   0.000       0.000     1.000   0.129
정도검사일  0.000     0.000   1.000       0.949     0.000   0.000
검사기관    0.000     0.000   0.949       1.000     0.000   0.000
모델명      0.337     1.000   0.000       0.000     1.000   0.129
검사결과    0.000     0.129   0.000       0.000     0.129   1.000

[Figure: correlation heatmap]

            연도   설치장소  장비명  모델명  검사기관  정도검사일  검사결과
연도        1.000  0.000     0.000   0.000   0.964     0.984       0.000
설치장소    0.000  1.000     0.337   0.337   0.000     0.000       0.000
장비명      0.000  0.337     1.000   1.000   0.000     0.000       0.129
모델명      0.000  0.337     1.000   1.000   0.000     0.000       0.129
검사기관    0.964  0.000     0.000   0.000   1.000     0.949       0.000
정도검사일  0.984  0.000     0.000   0.000   0.949     1.000       0.000
검사결과    0.000  0.000     0.129   0.129   0.000     0.000       1.000
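
The coefficient of 1.000 between 장비명 and 모델명 is what a one-to-one mapping between two categorical columns produces. The sketch below computes an uncorrected Cramér's V for such a pairing. Two things are assumptions: the report does not say which association statistic each matrix uses, and the contingency table is inferred from the identical marginal counts (45, 21, 18, 18) and from the sample rows, where each equipment type appears with a single model. `cramers_v` is a hypothetical helper name.

```python
import math


def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-square: squared gap between observed and expected counts.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals))
    return math.sqrt(chi2 / (n * (k - 1)))


# Assumed pairing (rows: 장비명, columns: 모델명):
# 미세먼지 -> FH62_C14, 이산화탄소 -> 410i, 일산화탄소 -> 48i, 이산화질소 -> 42i
equipment_by_model = [
    [45, 0, 0, 0],
    [0, 21, 0, 0],
    [0, 0, 18, 0],
    [0, 0, 0, 18],
]

v = cramers_v(equipment_by_model)
print(round(v, 3))  # prints 1.0
```

Because the two columns carry the same information under this pairing, one of them could be dropped before modeling without losing anything.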

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

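First 10 rows

     연도  설치장소  장비명      모델명    검사기관            정도검사일  검사결과
0    2015  판암역    미세먼지    FH62_C14  한국산업기술시험원  2015-06-22  적합
1    2015  대전역    미세먼지    FH62_C14  한국산업기술시험원  2015-06-22  적합
2    2015  대전역    이산화탄소  410i      한국산업기술시험원  2015-06-22  적합
3    2015  대전역    일산화탄소  48i       한국산업기술시험원  2015-06-22  적합
4    2015  대전역    이산화질소  42i       한국산업기술시험원  2015-06-22  적합
5    2015  시청역    미세먼지    FH62_C14  한국산업기술시험원  2015-06-22  적합
6    2015  시청역    이산화탄소  410i      한국산업기술시험원  2015-06-22  적합
7    2015  시청역    일산화탄소  48i       한국산업기술시험원  2015-06-22  적합
8    2015  시청역    이산화질소  42i       한국산업기술시험원  2015-06-22  적합
9    2015  월평역    미세먼지    FH62_C14  한국산업기술시험원  2015-06-22  적합

Last 10 rows

     연도  설치장소        장비명      모델명    검사기관            정도검사일  검사결과
92   2022  대전역          일산화탄소  48i       한국산업기술시험원  2022-06-25  적합
93   2022  시청역          이산화질소  42i       한국산업기술시험원  2022-06-25  적합
94   2023  월드컵경기장역  이산화탄소  410i      한국산업기술시험원  2023-06-27  적합
95   2023  월평역          미세먼지    FH62_C14  한국산업기술시험원  2023-06-27  적합
96   2023  월드컵경기장역  미세먼지    FH62_C14  한국산업기술시험원  2023-06-27  적합
97   2023  대전역          미세먼지    FH62_C14  한국산업기술시험원  2023-06-27  적합
98   2023  대전역          이산화탄소  410i      한국산업기술시험원  2023-06-27  적합
99   2023  시청역          일산화탄소  48i       한국산업기술시험원  2023-06-27  적합
100  2023  대전역          일산화탄소  48i       한국산업기술시험원  2023-07-12  적합
101  2023  시청역          이산화질소  42i       한국산업기술시험원  2023-06-27  적합

Key: 연도 = year; 설치장소 = installation location; 장비명 = equipment name; 모델명 = model name; 검사기관 = inspection agency; 정도검사일 = accuracy inspection date; 검사결과 = inspection result; 미세먼지 = fine dust; 이산화탄소 = carbon dioxide; 일산화탄소 = carbon monoxide; 이산화질소 = nitrogen dioxide; 한국산업기술시험원 = Korea Testing Laboratory; 적합 = conforming.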