Overview

Dataset statistics

Number of variables: 10
Number of observations: 403
Missing cells: 4
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 32.4 KiB
Average record size in memory: 82.3 B
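
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, using only the numbers shown in the overview (the variable names are illustrative, and the exact rounding the profiler applies is an assumption):

```python
# Figures copied from the overview above.
n_variables = 10
n_observations = 403
missing_cells = 4
total_size_kib = 32.4

# Share of missing cells across the whole table (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per record, taking 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Since 32.4 KiB is itself a rounded value, the per-record result should only be expected to agree to the displayed precision.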

Variable types

DateTime: 2
Categorical: 5
Numeric: 2
Text: 1

Dataset

Description: Fine dust alert inquiry service (미세먼지 경보 조회 서비스)
Author: Gyeonggi-do (경기도)
URL: https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=T0VV0B225CV46E4HSTEE30126364&infSeq=1

Alerts

발령농도(μg/m³) is highly overall correlated with 해제농도(μg/m³) and 2 other fields (High correlation)
해제농도(μg/m³) is highly overall correlated with 발령농도(μg/m³) and 2 other fields (High correlation)
항목명 is highly overall correlated with 발령농도(μg/m³) and 1 other field (High correlation)
경보단계명 is highly overall correlated with 발령농도(μg/m³) and 1 other field (High correlation)
경보단계명 is highly imbalanced (54.9%) (Imbalance)
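
The imbalance figure for 경보단계명 can be reproduced from the class counts reported later in the report (주의보 365, 경보 38) as one minus the normalized Shannon entropy. This is a sketch of that calculation; that the profiler uses exactly this formula is an assumption, although it does reproduce the reported 54.9%.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no meaningful imbalance here
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts for 경보단계명 taken from the report: 주의보 = 365, 경보 = 38.
score = imbalance_score([365, 38])
print(f"{score:.1%}")
```

A perfectly balanced variable scores 0 under this definition, and the score approaches 1 as one class dominates.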

Reproduction

Analysis started: 2024-05-10 21:32:01.110328
Analysis finished: 2024-05-10 21:32:03.819682
Duration: 2.71 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

기준일자
DateTime

Distinct: 136
Distinct (%): 33.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
Minimum: 2018-03-12 00:00:00
Maximum: 2024-04-18 00:00:00
Histogram with fixed size bins (bins=50)

권역명
Categorical

Distinct: 4
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 북부권
2nd row: 북부권
3rd row: 중부권
4th row: 동부권
5th row: 중부권

Common Values

Value    Count    Frequency (%)
중부권    112    27.8%
남부권    112    27.8%
북부권    93    23.1%
동부권    86    21.3%


항목명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PM10
2nd row: PM10
3rd row: PM10
4th row: PM10
5th row: PM10

Common Values

Value    Count    Frequency (%)
PM25    209    51.9%
PM10    194    48.1%


경보단계명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 3
Median length: 3
Mean length: 2.9057072
Min length: 2
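
The fractional mean length follows from the mix of three-character (주의보) and two-character (경보) labels. A short check, using the counts reported for this variable:

```python
# Label counts for 경보단계명 as reported in this section.
counts = {"주의보": 365, "경보": 38}

# Weight each label's character length by how often it occurs.
total_chars = sum(len(label) * n for label, n in counts.items())
total_rows = sum(counts.values())
mean_length = total_chars / total_rows

print(round(mean_length, 7))
```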

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 주의보
2nd row: 주의보
3rd row: 주의보
4th row: 주의보
5th row: 주의보

Common Values

Value    Count    Frequency (%)
주의보    365    90.6%
경보    38    9.4%


발령일자
DateTime

Distinct: 135
Distinct (%): 33.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
Minimum: 2018-03-12 00:00:00
Maximum: 2024-04-18 00:00:00
Histogram with fixed size bins (bins=50)

발령시간
Categorical

Distinct: 24
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 12:00
2nd row: 14:00
3rd row: 14:00
4th row: 15:00
5th row: 15:00

Common Values

Value    Count    Frequency (%)
11:00    35    8.7%
10:00    33    8.2%
12:00    29    7.2%
13:00    28    6.9%
20:00    25    6.2%
14:00    25    6.2%
15:00    23    5.7%
22:00    21    5.2%
09:00    20    5.0%
16:00    18    4.5%
Other values (14)    146    36.2%


발령농도(μg/m³)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 154
Distinct (%): 38.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 150.19107
Minimum: 64
Maximum: 564
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 64
5-th percentile: 75
Q1: 81
Median: 123
Q3: 174
95-th percentile: 358.9
Maximum: 564
Range: 500
Interquartile range (IQR): 93

Descriptive statistics

Standard deviation: 94.332089
Coefficient of variation (CV): 0.62808056
Kurtosis: 3.408162
Mean: 150.19107
Median Absolute Deviation (MAD): 44
Skewness: 1.8002489
Sum: 60527
Variance: 8898.543
Monotonicity: Not monotonic
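
Several of the statistics above are derived from one another. The following sketch recomputes them from the reported values to show the relationships; it checks internal consistency only and does not use the underlying data.

```python
# Values copied from the 발령농도(μg/m³) section above.
n = 403
total = 60527
minimum, maximum = 64, 564
q1, q3 = 81, 174
std = 94.332089

mean = total / n                 # Sum divided by the number of observations
value_range = maximum - minimum  # Maximum minus Minimum
iqr = q3 - q1                    # Q3 minus Q1
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance is the squared standard deviation

print(f"mean={mean:.5f} range={value_range} iqr={iqr} "
      f"cv={cv:.8f} variance={variance:.3f}")
```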

Histogram with fixed size bins (bins=50)

Common values

Value    Count    Frequency (%)
81    21    5.2%
77    20    5.0%
79    19    4.7%
75    13    3.2%
78    12    3.0%
84    12    3.0%
76    12    3.0%
80    11    2.7%
85    9    2.2%
83    9    2.2%
Other values (144)    265    65.8%

Minimum 10 values

Value    Count    Frequency (%)
64    2    0.5%
70    3    0.7%
71    2    0.5%
72    1    0.2%
73    1    0.2%
74    2    0.5%
75    13    3.2%
76    12    3.0%
77    20    5.0%
78    12    3.0%

Maximum 10 values

Value    Count    Frequency (%)
564    1    0.2%
559    1    0.2%
533    1    0.2%
524    1    0.2%
505    1    0.2%
463    1    0.2%
440    1    0.2%
434    1    0.2%
430    1    0.2%
396    1    0.2%

해제일자
Text

Distinct: 133
Distinct (%): 33.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 10
Median length: 10
Mean length: 9.9205955
Min length: 2
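
The length statistics are consistent with a column that mixes ten-character dates with a few two-character entries. The sketch below solves for how many short entries that would imply. It assumes every value is either 10 or 2 characters long, which the report does not state explicitly.

```python
# Figures from the report: 403 rows in total, Max length 10 and Min length 2
# for this variable, and 3998 total characters (listed under
# "Characters and Unicode").
n_rows = 403
total_chars = 3998
long_len, short_len = 10, 2

# Assumption: each value has either long_len or short_len characters, so
# long_len * (n_rows - k) + short_len * k = total_chars. Solve for k.
short_rows = (long_len * n_rows - total_chars) // (long_len - short_len)
mean_length = total_chars / n_rows

print(short_rows, round(mean_length, 7))
```

Four short entries would equal the number of missing 해제농도(μg/m³) values, though the report does not confirm that they fall on the same rows.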

Characters and Unicode

Total characters: 3998
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 40
Unique (%): 9.9%

Sample

1st row: 2024-04-18
2nd row: 2024-04-18
3rd row: 2024-04-18
4th row: 2024-04-18
5th row: 2024-04-17

Common Values

Value    Count    Frequency (%)
2024-03-29    12    3.0%
2021-05-09    12    3.0%
2018-11-28    10    2.5%
2022-12-13    10    2.5%
2023-04-07    9    2.2%
2019-03-07    9    2.2%
2023-11-23    8    2.0%
2020-02-22    8    2.0%
2021-03-29    8    2.0%
2019-12-11    8    2.0%
Other values (123)    309    76.7%

Most occurring characters

Value    Count    Frequency (%)
2    992    24.8%
0    916    22.9%
-    806    20.2%
1    518    13.0%
3    261    6.5%
4    118    3.0%
9    115    2.9%
8    97    2.4%
5    72    1.8%
7    71    1.8%

Most occurring categories

Value    Count    Frequency (%)
Decimal Number    3192    79.8%
Dash Punctuation    806    20.2%

Most frequent character per category

Decimal Number

Value    Count    Frequency (%)
2    992    31.1%
0    916    28.7%
1    518    16.2%
3    261    8.2%
4    118    3.7%
9    115    3.6%
8    97    3.0%
5    72    2.3%
7    71    2.2%
6    32    1.0%

Dash Punctuation

Value    Count    Frequency (%)
-    806    100.0%

Most occurring scripts

Value    Count    Frequency (%)
Common    3998    100.0%


Most occurring blocks

Value    Count    Frequency (%)
ASCII    3998    100.0%


해제시간
Categorical

Distinct: 25
Distinct (%): 6.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 5
Median length: 5
Mean length: 4.9801489
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 16:00
2nd row: 06:00
3rd row: 18:00
4th row: 19:00
5th row: 02:00

Common Values

Value    Count    Frequency (%)
16:00    43    10.7%
14:00    35    8.7%
17:00    29    7.2%
15:00    25    6.2%
13:00    23    5.7%
12:00    23    5.7%
04:00    22    5.5%
01:00    21    5.2%
19:00    18    4.5%
02:00    18    4.5%
Other values (15)    146    36.2%


해제농도(μg/m³)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 112
Distinct (%): 28.1%
Missing: 4
Missing (%): 1.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 86.511278
Minimum: 11
Maximum: 564
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 11
5-th percentile: 25
Q1: 32
Median: 72
Q3: 96
95-th percentile: 326.9
Maximum: 564
Range: 553
Interquartile range (IQR): 64

Descriptive statistics

Standard deviation: 91.952163
Coefficient of variation (CV): 1.0628922
Kurtosis: 9.2663766
Mean: 86.511278
Median Absolute Deviation (MAD): 39
Skewness: 2.9088235
Sum: 34518
Variance: 8455.2002
Monotonicity: Not monotonic
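
With four missing values, the mean and the distinct percentage here appear to be computed over the 399 non-missing observations rather than all 403. A quick check against the reported figures (a consistency sketch, not the profiler's own code):

```python
# Values copied from the 해제농도(μg/m³) section above.
n_rows = 403
n_missing = 4
n_distinct = 112
total = 34518
std = 91.952163

n_valid = n_rows - n_missing               # non-missing observations
mean = total / n_valid                     # Sum over non-missing values
distinct_pct = 100 * n_distinct / n_valid  # Distinct share of non-missing values
missing_pct = 100 * n_missing / n_rows     # Missing share of all rows
cv = std / mean                            # Coefficient of variation

print(f"mean={mean:.6f} distinct={distinct_pct:.1f}% "
      f"missing={missing_pct:.1f}% cv={cv:.7f}")
```

Dividing by all 403 rows instead would give a distinct percentage of about 27.8%, which does not match the reported 28.1%.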

Histogram with fixed size bins (bins=50)

Common values

Value    Count    Frequency (%)
33    36    8.9%
32    27    6.7%
34    27    6.7%
31    18    4.5%
99    15    3.7%
30    15    3.7%
29    13    3.2%
93    11    2.7%
96    11    2.7%
98    10    2.5%
Other values (102)    216    53.6%

Minimum 10 values

Value    Count    Frequency (%)
11    1    0.2%
16    1    0.2%
20    2    0.5%
21    1    0.2%
22    2    0.5%
23    3    0.7%
24    6    1.5%
25    6    1.5%
26    5    1.2%
27    8    2.0%

Maximum 10 values

Value    Count    Frequency (%)
564    1    0.2%
559    1    0.2%
533    1    0.2%
524    1    0.2%
505    1    0.2%
440    1    0.2%
434    1    0.2%
430    1    0.2%
396    1    0.2%
392    1    0.2%


Correlations

                   권역명    항목명    경보단계명    발령시간    발령농도(μg/m³)    해제시간    해제농도(μg/m³)
권역명              1.000    0.143    0.000    0.000    0.015    0.000    0.000
항목명              0.143    1.000    0.203    0.346    0.992    0.244    0.986
경보단계명          0.000    0.203    1.000    0.247    0.827    0.074    0.755
발령시간            0.000    0.346    0.247    1.000    0.396    0.382    0.424
발령농도(μg/m³)    0.015    0.992    0.827    0.396    1.000    0.239    0.774
해제시간            0.000    0.244    0.074    0.382    0.239    1.000    0.232
해제농도(μg/m³)    0.000    0.986    0.755    0.424    0.774    0.232    1.000
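
The high-correlation alerts can be traced back to this matrix. The sketch below lists, for each variable, the other variables whose coefficient meets a cutoff. The cutoff of 0.5 is an assumption, as is the idea that the alerts are driven by this particular matrix; under those assumptions the result agrees with the alerts listed in the overview.

```python
names = ["권역명", "항목명", "경보단계명", "발령시간",
         "발령농도(μg/m³)", "해제시간", "해제농도(μg/m³)"]

# Correlation matrix copied from the table above (rows follow `names`).
matrix = [
    [1.000, 0.143, 0.000, 0.000, 0.015, 0.000, 0.000],
    [0.143, 1.000, 0.203, 0.346, 0.992, 0.244, 0.986],
    [0.000, 0.203, 1.000, 0.247, 0.827, 0.074, 0.755],
    [0.000, 0.346, 0.247, 1.000, 0.396, 0.382, 0.424],
    [0.015, 0.992, 0.827, 0.396, 1.000, 0.239, 0.774],
    [0.000, 0.244, 0.074, 0.382, 0.239, 1.000, 0.232],
    [0.000, 0.986, 0.755, 0.424, 0.774, 0.232, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not stated in the report

def high_correlations(names, matrix, threshold):
    """Map each variable to the other variables at or above the threshold."""
    result = {}
    for i, row in enumerate(matrix):
        partners = [names[j] for j, value in enumerate(row)
                    if j != i and value >= threshold]
        if partners:
            result[names[i]] = partners
    return result

for name, partners in high_correlations(names, matrix, THRESHOLD).items():
    print(name, "->", ", ".join(partners))
```

Under these assumptions, the two concentration columns each have three partners and 항목명 and 경보단계명 each have two, while the region and time columns have none.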

              권역명    발령시간    경보단계명    항목명    해제시간
권역명         1.000    0.000    0.000    0.094    0.000
발령시간       0.000    1.000    0.190    0.267    0.105
경보단계명     0.000    0.190    1.000    0.130    0.061
항목명         0.094    0.267    0.130    1.000    0.204
해제시간       0.000    0.105    0.061    0.204    1.000

                   발령농도(μg/m³)    해제농도(μg/m³)    권역명    항목명    경보단계명    발령시간    해제시간
발령농도(μg/m³)    1.000    0.784    0.005    0.911    0.653    0.153    0.084
해제농도(μg/m³)    0.784    1.000    0.000    0.887    0.576    0.159    0.081
권역명              0.005    0.000    1.000    0.094    0.000    0.000    0.000
항목명              0.911    0.887    0.094    1.000    0.130    0.267    0.204
경보단계명          0.653    0.576    0.000    0.130    1.000    0.190    0.061
발령시간            0.153    0.159    0.000    0.267    0.190    1.000    0.105
해제시간            0.084    0.081    0.000    0.204    0.061    0.105    1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

       기준일자    권역명    항목명    경보단계명    발령일자    발령시간    발령농도(μg/m³)    해제일자    해제시간    해제농도(μg/m³)
0    2024-04-18    북부권    PM10    주의보    2024-04-18    12:00    156    2024-04-18    16:00    93
1    2024-04-17    북부권    PM10    주의보    2024-04-17    14:00    221    2024-04-18    06:00    99
2    2024-04-17    중부권    PM10    주의보    2024-04-17    14:00    199    2024-04-18    18:00    80
3    2024-04-17    동부권    PM10    주의보    2024-04-17    15:00    223    2024-04-18    19:00    84
4    2024-04-16    중부권    PM10    주의보    2024-04-16    15:00    160    2024-04-17    02:00    96
5    2024-04-16    동부권    PM10    주의보    2024-04-16    16:00    190    2024-04-17    04:00    98
6    2024-04-16    남부권    PM10    주의보    2024-04-16    15:00    183    2024-04-18    19:00    99
7    2024-03-29    중부권    PM10    주의보    2024-03-29    02:00    339    2024-03-29    03:00    346
8    2024-03-29    북부권    PM10    경보    2024-03-29    04:00    362    2024-03-29    14:00    117
9    2024-03-29    동부권    PM10    주의보    2024-03-29    02:00    274    2024-03-29    04:00    370

Last rows

         기준일자    권역명    항목명    경보단계명    발령일자    발령시간    발령농도(μg/m³)    해제일자    해제시간    해제농도(μg/m³)
393    2018-03-25    중부권    PM10    주의보    2018-03-25    13:00    172    2018-03-26    00:00    93
394    2018-03-25    남부권    PM10    주의보    2018-03-25    05:00    154    2018-03-25    18:00    97
395    2018-03-24    남부권    PM25    주의보    2018-03-24    10:00    95    2018-03-26    16:00    47
396    2018-03-24    북부권    PM25    주의보    2018-03-24    10:00    97    2018-03-26    18:00    44
397    2018-03-24    동부권    PM25    주의보    2018-03-24    10:00    91    2018-03-26    16:00    48
398    2018-03-24    중부권    PM25    주의보    2018-03-24    21:00    97    2018-03-26    16:00    45
399    2018-03-12    남부권    PM25    주의보    2018-03-12    20:00    97    2018-03-13    13:00    35
400    2018-03-12    동부권    PM25    주의보    2018-03-12    19:00    94    2018-03-13    13:00    41
401    2018-03-12    중부권    PM10    주의보    2018-03-12    16:00    151    2018-03-13    03:00    98
402    2018-03-12    중부권    PM25    주의보    2018-03-12    15:00    94    2018-03-13    08:00    48
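
Because the issue and release timestamps are split across separate date and time columns, a duration has to be assembled from both. A minimal sketch using the first sample row; the `%Y-%m-%d %H:%M` format is inferred from the sample, and the helper name `alert_duration` is illustrative.

```python
from datetime import datetime

def alert_duration(issued_date, issued_time, released_date, released_time):
    """Return the time between an alert being issued and released."""
    fmt = "%Y-%m-%d %H:%M"
    issued = datetime.strptime(f"{issued_date} {issued_time}", fmt)
    released = datetime.strptime(f"{released_date} {released_time}", fmt)
    return released - issued

# First sample row: issued 2024-04-18 12:00, released 2024-04-18 16:00.
duration = alert_duration("2024-04-18", "12:00", "2024-04-18", "16:00")
print(duration)
```

Rows whose release fields are not valid dates or times (the length statistics for 해제일자 and 해제시간 suggest a few exist) would raise `ValueError` here and would need to be handled separately.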