Overview

Dataset statistics

Number of variables: 12
Number of observations: 49
Missing cells: 57
Missing cells (%): 9.7%
Duplicate rows: 3
Duplicate rows (%): 6.1%
Total size in memory: 4.9 KiB
Average record size in memory: 101.7 B
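
The percentages in this table follow from the raw counts. The sketch below recomputes them as a sanity check; the variable names are illustrative, and it assumes the total size is simply the average record size multiplied by the number of observations.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 12
n_observations = 49
missing_cells = 57
duplicate_rows = 3
average_record_bytes = 101.7

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
# Assumption: total size = average record size * number of records.
total_kib = average_record_bytes * n_observations / 1024

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Total size in memory: {total_kib:.1f} KiB")
```

The rounded results agree with the reported 9.7%, 6.1%, and 4.9 KiB.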

Variable types

Text: 1
Categorical: 2
Unsupported: 6
Numeric: 3

Dataset

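Description: Monthly fine dust (PM-10) concentration measurements for Wonju City (원주시), Gangwon Province (강원도), in 2020. Example: the various PM-10 measurement results for January at 중앙동, 반곡동, 문막읍, and the city average (도시평균).
Author: 강원도 원주시 (Wonju City, Gangwon Province)
URL: https://www.data.go.kr/data/15092042/fileData.do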

Alerts

Dataset has 3 (6.1%) duplicate rows [Duplicates]
도시명 is highly overall correlated with 유효 측정일수 and 3 other fields [High correlation]
측정소명 is highly overall correlated with 유효 측정일수 and 2 other fields [High correlation]
유효 측정일수 is highly overall correlated with 유효 측정시간 and 2 other fields [High correlation]
유효 측정시간 is highly overall correlated with 유효 측정일수 and 2 other fields [High correlation]
월평균 (㎍/㎥) is highly overall correlated with 도시명 [High correlation]
시,도명 has 37 (75.5%) missing values [Missing]
유효자료 획득율(%) has 1 (2.0%) missing values [Missing]
유효 측정일수 has 3 (6.1%) missing values [Missing]
유효 측정시간 has 3 (6.1%) missing values [Missing]
월평균 (㎍/㎥) has 3 (6.1%) missing values [Missing]
24시간치 has 2 (4.1%) missing values [Missing]
Unnamed: 8 has 2 (4.1%) missing values [Missing]
Unnamed: 9 has 2 (4.1%) missing values [Missing]
Unnamed: 10 has 2 (4.1%) missing values [Missing]
Unnamed: 11 has 2 (4.1%) missing values [Missing]
유효자료 획득율(%) is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
24시간치 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 8 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 9 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 10 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 11 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]

Reproduction

Analysis started: 2023-12-12 03:19:10.185485
Analysis finished: 2023-12-12 03:19:12.037327
Duration: 1.85 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시,도명
Text

MISSING 

Distinct: 12
Distinct (%): 100.0%
Missing: 37
Missing (%): 75.5%
Memory size: 524.0 B

Length

Max length: 3
Median length: 2
Mean length: 2.25
Min length: 2

Characters and Unicode

Total characters: 27
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 12
Unique (%): 100.0%

Sample

1st row: 1월
2nd row: 2월
3rd row: 3월
4th row: 4월
5th row: 5월

Value | Count | Frequency (%)
1월 | 1 | 8.3%
2월 | 1 | 8.3%
3월 | 1 | 8.3%
4월 | 1 | 8.3%
5월 | 1 | 8.3%
6월 | 1 | 8.3%
7월 | 1 | 8.3%
8월 | 1 | 8.3%
9월 | 1 | 8.3%
10월 | 1 | 8.3%
Other values (2) | 2 | 16.7%

Most occurring characters

Value | Count | Frequency (%)
월 | 12 | 44.4%
1 | 5 | 18.5%
2 | 2 | 7.4%
3 | 1 | 3.7%
4 | 1 | 3.7%
5 | 1 | 3.7%
6 | 1 | 3.7%
7 | 1 | 3.7%
8 | 1 | 3.7%
9 | 1 | 3.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 15 | 55.6%
Other Letter | 12 | 44.4%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 5 | 33.3%
2 | 2 | 13.3%
3 | 1 | 6.7%
4 | 1 | 6.7%
5 | 1 | 6.7%
6 | 1 | 6.7%
7 | 1 | 6.7%
8 | 1 | 6.7%
9 | 1 | 6.7%
0 | 1 | 6.7%
Other Letter
Value | Count | Frequency (%)
월 | 12 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 15 | 55.6%
Hangul | 12 | 44.4%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 5 | 33.3%
2 | 2 | 13.3%
3 | 1 | 6.7%
4 | 1 | 6.7%
5 | 1 | 6.7%
6 | 1 | 6.7%
7 | 1 | 6.7%
8 | 1 | 6.7%
9 | 1 | 6.7%
0 | 1 | 6.7%
Hangul
Value | Count | Frequency (%)
월 | 12 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 15 | 55.6%
Hangul | 12 | 44.4%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
월 | 12 | 100.0%
ASCII
Value | Count | Frequency (%)
1 | 5 | 33.3%
2 | 2 | 13.3%
3 | 1 | 6.7%
4 | 1 | 6.7%
5 | 1 | 6.7%
6 | 1 | 6.7%
7 | 1 | 6.7%
8 | 1 | 6.7%
9 | 1 | 6.7%
0 | 1 | 6.7%

도시명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Memory size: 524.0 B

Value | Count
<NA> | 37
원주시 | 12

Length

Max length: 4
Median length: 4
Mean length: 3.755102
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: 원주시
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 37 | 75.5%
원주시 | 12 | 24.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 37 | 75.5%
원주시 | 12 | 24.5%

측정소명
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 10.2%
Missing: 0
Missing (%): 0.0%
Memory size: 524.0 B

Value | Count
중앙동 | 12
반곡동 | 12
문막읍 | 12
도시평균 | 12
<NA> | 1

Length

Max length: 4
Median length: 3
Mean length: 3.2653061
Min length: 3

Unique

Unique: 1
Unique (%): 2.0%

Sample

1st row: <NA>
2nd row: 중앙동
3rd row: 반곡동
4th row: 문막읍
5th row: 도시평균

Common Values

Value | Count | Frequency (%)
중앙동 | 12 | 24.5%
반곡동 | 12 | 24.5%
문막읍 | 12 | 24.5%
도시평균 | 12 | 24.5%
<NA> | 1 | 2.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
중앙동 | 12 | 24.5%
반곡동 | 12 | 24.5%
문막읍 | 12 | 24.5%
도시평균 | 12 | 24.5%
na | 1 | 2.0%

유효자료 획득율(%)
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 1
Missing (%): 2.0%
Memory size: 524.0 B

유효 측정일수
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 14
Distinct (%): 30.4%
Missing: 3
Missing (%): 6.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 43.782609
Minimum: 16
Maximum: 93
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 573.0 B

Quantile statistics

Minimum: 16
5-th percentile: 27
Q1: 30
median: 31
Q3: 51.25
95-th percentile: 93
Maximum: 93
Range: 77
Interquartile range (IQR): 21.25

Descriptive statistics

Standard deviation: 25.040335
Coefficient of variation (CV): 0.57192423
Kurtosis: -0.19719112
Mean: 43.782609
Median Absolute Deviation (MAD): 1
Skewness: 1.2663854
Sum: 2014
Variance: 627.01836
Monotonicity: Not monotonic
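
Several of the figures for 유효 측정일수 are derived from the others. The following sketch, using only values copied from this section, shows how range, IQR, CV, variance, and mean relate. The variable names are illustrative, and the reported mean is assumed to be taken over the non-missing rows only.

```python
# Values copied from the 유효 측정일수 statistics in this section.
minimum, maximum = 16, 93
q1, q3 = 30, 51.25
mean = 43.782609
std_dev = 25.040335
total = 2014
n_observations, n_missing = 49, 3

value_range = maximum - minimum   # reported: 77
iqr = q3 - q1                     # reported: 21.25
cv = std_dev / mean               # reported: 0.57192423
variance = std_dev ** 2           # reported: 627.01836
# Assumption: the mean is computed over non-missing rows only.
mean_from_sum = total / (n_observations - n_missing)  # reported: 43.782609

print(value_range, iqr)
print(f"{cv:.6f} {variance:.3f} {mean_from_sum:.6f}")
```

The same relations hold for the 유효 측정시간 and 월평균 (㎍/㎥) sections.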
Histogram with fixed size bins (bins=14)
Value | Count | Frequency (%)
31 | 16 | 32.7%
30 | 9 | 18.4%
93 | 4 | 8.2%
29 | 4 | 8.2%
27 | 3 | 6.1%
90 | 2 | 4.1%
28 | 1 | 2.0%
86 | 1 | 2.0%
89 | 1 | 2.0%
16 | 1 | 2.0%
Other values (4) | 4 | 8.2%
(Missing) | 3 | 6.1%

Value | Count | Frequency (%)
16 | 1 | 2.0%
27 | 3 | 6.1%
28 | 1 | 2.0%
29 | 4 | 8.2%
30 | 9 | 18.4%
31 | 16 | 32.7%
58 | 1 | 2.0%
59 | 1 | 2.0%
75 | 1 | 2.0%
86 | 1 | 2.0%

Value | Count | Frequency (%)
93 | 4 | 8.2%
90 | 2 | 4.1%
89 | 1 | 2.0%
88 | 1 | 2.0%
86 | 1 | 2.0%
75 | 1 | 2.0%
59 | 1 | 2.0%
58 | 1 | 2.0%
31 | 16 | 32.7%
30 | 9 | 18.4%

유효 측정시간
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 34
Distinct (%): 73.9%
Missing: 3
Missing (%): 6.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1050.7391
Minimum: 383
Maximum: 2216
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 573.0 B

Quantile statistics

Minimum: 383
5-th percentile: 685.5
Q1: 715
median: 736.5
Q3: 1249
95-th percentile: 2212.75
Maximum: 2216
Range: 1833
Interquartile range (IQR): 534

Descriptive statistics

Standard deviation: 599.22129
Coefficient of variation (CV): 0.5702855
Kurtosis: -0.22627225
Mean: 1050.7391
Median Absolute Deviation (MAD): 24.5
Skewness: 1.2595886
Sum: 48334
Variance: 359066.15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=34)
Value | Count | Frequency (%)
740 | 5 | 10.2%
717 | 3 | 6.1%
730 | 2 | 4.1%
715 | 2 | 4.1%
709 | 2 | 4.1%
741 | 2 | 4.1%
736 | 2 | 4.1%
2216 | 2 | 4.1%
2063 | 1 | 2.0%
739 | 1 | 2.0%
Other values (24) | 24 | 49.0%
(Missing) | 3 | 6.1%

Value | Count | Frequency (%)
383 | 1 | 2.0%
681 | 1 | 2.0%
685 | 1 | 2.0%
687 | 1 | 2.0%
691 | 1 | 2.0%
695 | 1 | 2.0%
697 | 1 | 2.0%
698 | 1 | 2.0%
701 | 1 | 2.0%
709 | 2 | 4.1%

Value | Count | Frequency (%)
2216 | 2 | 4.1%
2213 | 1 | 2.0%
2212 | 1 | 2.0%
2166 | 1 | 2.0%
2149 | 1 | 2.0%
2141 | 1 | 2.0%
2140 | 1 | 2.0%
2063 | 1 | 2.0%
1798 | 1 | 2.0%
1435 | 1 | 2.0%

월평균 (㎍/㎥)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 23
Distinct (%): 50.0%
Missing: 3
Missing (%): 6.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.782609
Minimum: 15
Maximum: 48
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 573.0 B

Quantile statistics

Minimum: 15
5-th percentile: 17
Q1: 25.25
median: 33
Q3: 38
95-th percentile: 43
Maximum: 48
Range: 33
Interquartile range (IQR): 12.75

Descriptive statistics

Standard deviation: 8.991139
Coefficient of variation (CV): 0.28289493
Kurtosis: -0.93381235
Mean: 31.782609
Median Absolute Deviation (MAD): 6
Skewness: -0.43498647
Sum: 1462
Variance: 80.84058
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=23)
Value | Count | Frequency (%)
38 | 7 | 14.3%
17 | 4 | 8.2%
43 | 4 | 8.2%
39 | 3 | 6.1%
18 | 3 | 6.1%
33 | 3 | 6.1%
29 | 2 | 4.1%
36 | 2 | 4.1%
32 | 2 | 4.1%
30 | 2 | 4.1%
Other values (13) | 14 | 28.6%
(Missing) | 3 | 6.1%

Value | Count | Frequency (%)
15 | 1 | 2.0%
17 | 4 | 8.2%
18 | 3 | 6.1%
20 | 1 | 2.0%
23 | 1 | 2.0%
24 | 1 | 2.0%
25 | 1 | 2.0%
26 | 1 | 2.0%
27 | 1 | 2.0%
29 | 2 | 4.1%

Value | Count | Frequency (%)
48 | 1 | 2.0%
43 | 4 | 8.2%
42 | 1 | 2.0%
41 | 1 | 2.0%
40 | 1 | 2.0%
39 | 3 | 6.1%
38 | 7 | 14.3%
36 | 2 | 4.1%
35 | 2 | 4.1%
33 | 3 | 6.1%

24시간치
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 2
Missing (%): 4.1%
Memory size: 524.0 B

Unnamed: 8
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 2
Missing (%): 4.1%
Memory size: 524.0 B

Unnamed: 9
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 2
Missing (%): 4.1%
Memory size: 524.0 B

Unnamed: 10
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 2
Missing (%): 4.1%
Memory size: 524.0 B

Unnamed: 11
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 2
Missing (%): 4.1%
Memory size: 524.0 B

Interactions

[Figure: interaction plots (9 panels)]

Correlations

Variable | 시,도명 | 측정소명 | 유효 측정일수 | 유효 측정시간 | 월평균 (㎍/㎥)
시,도명 | 1.000 | NaN | NaN | NaN | 1.000
측정소명 | NaN | 1.000 | 0.616 | 0.870 | 0.000
유효 측정일수 | NaN | 0.616 | 1.000 | 1.000 | 0.772
유효 측정시간 | NaN | 0.870 | 1.000 | 1.000 | 0.000
월평균 (㎍/㎥) | 1.000 | 0.000 | 0.772 | 0.000 | 1.000

Variable | 도시명 | 측정소명
도시명 | 1.000 | 1.000
측정소명 | 1.000 | 1.000

Variable | 유효 측정일수 | 유효 측정시간 | 월평균 (㎍/㎥) | 도시명 | 측정소명
유효 측정일수 | 1.000 | 0.971 | 0.018 | 1.000 | 0.537
유효 측정시간 | 0.971 | 1.000 | 0.016 | 1.000 | 0.537
월평균 (㎍/㎥) | 0.018 | 0.016 | 1.000 | 1.000 | 0.000
도시명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
측정소명 | 0.537 | 0.537 | 0.000 | 1.000 | 1.000
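
The strong association between 유효 측정일수 and 유효 측정시간 is consistent with hours being roughly 24 times days. The sketch below computes a plain Pearson coefficient on the nine data rows visible at the top of the Sample table (rows 1 to 9). It is only an illustration: the report does not state which coefficient each matrix uses, so this is not a reproduction of the 0.971 or 1.000 entries, and the `pearson` helper is written here for the example.

```python
import math

# (유효 측정일수, 유효 측정시간) pairs from Sample rows 1-9.
pairs = [(31, 736), (31, 740), (31, 740), (93, 2216), (29, 685),
         (28, 687), (29, 691), (86, 2063), (31, 730)]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

days = [d for d, _ in pairs]
hours = [h for _, h in pairs]
r = pearson(days, hours)
hours_per_day = [h / d for d, h in pairs]

print(f"Pearson r on the nine rows: {r:.4f}")
print("Hours per valid day:", [round(x, 2) for x in hours_per_day])
```

In these rows the 도시평균 entries equal the sum of the three station rows (for example 31 + 31 + 31 = 93 days and 736 + 740 + 740 = 2216 hours), which would also explain the second cluster of values near 90 days in the 유효 측정일수 histogram.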

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
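
Most of the missing cells sit in 시,도명 (37 of 49 rows). In the Sample table the month label appears only on the first row of each four-row group, so the blanks look structural rather than unknown: 12 labels plus 37 blanks give the 49 rows. Assuming the blanks come from merged cells in the source file, a forward fill would recover them. The `forward_fill` helper below is a hypothetical stdlib-only sketch of that idea.

```python
def forward_fill(values):
    """Replace each None with the most recent non-None value seen so far."""
    filled = []
    last_seen = None
    for value in values:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled

# 시,도명 values for Sample rows 1-8: one label per four-row group.
months = ["1월", None, None, None, "2월", None, None, None]
print(forward_fill(months))
```

Leading blanks (such as row 0, which holds sub-header text) stay as None because there is no earlier value to carry forward.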

Sample

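Index | 시,도명 | 도시명 | 측정소명 | 유효자료 획득율(%) | 유효 측정일수 | 유효 측정시간 | 월평균 (㎍/㎥) | 24시간치 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11
0 | <NA> | <NA> | <NA> | NaN | <NA> | <NA> | <NA> | 최저 (㎍/㎥) | 최고 (㎍/㎥) | 최고일시 (년월일시) | 기준초과 (회) | 초과율 (%)
1 | 1월 | 원주시 | 중앙동 | 98.92 | 31 | 736 | 36 | 1 | 65 | 20200111 | 0 | 0
2 | <NA> | <NA> | 반곡동 | 99.46 | 31 | 740 | 43 | 7 | 79 | 20200124 | 0 | 0
3 | <NA> | <NA> | 문막읍 | 99.46 | 31 | 740 | 38 | 9 | 70 | 20200124 | 0 | 0
4 | <NA> | <NA> | 도시평균 | 99.28 | 93 | 2216 | 39 | 1 | 79 | 20200124 | 0 | 0
5 | 2월 | 원주시 | 중앙동 | 98.42 | 29 | 685 | 30 | 4 | 70 | 20200202 | 0 | 0
6 | <NA> | <NA> | 반곡동 | 98.71 | 28 | 687 | 41 | 11 | 85 | 20200202 | 0 | 0
7 | <NA> | <NA> | 문막읍 | 99.28 | 29 | 691 | 38 | 9 | 74 | 20200202 | 0 | 0
8 | <NA> | <NA> | 도시평균 | 98.8 | 86 | 2063 | 36 | 4 | 85 | 20200202 | 0 | 0
9 | 3월 | 원주시 | 중앙동 | 98.12 | 31 | 730 | 32 | 12 | 54 | 20200325 | 0 | 0

Index | 시,도명 | 도시명 | 측정소명 | 유효자료 획득율(%) | 유효 측정일수 | 유효 측정시간 | 월평균 (㎍/㎥) | 24시간치 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11
39 | <NA> | <NA> | 문막읍 | 99.46 | 31 | 740 | 29 | 10 | 65 | 20201027 | 0 | 0
40 | <NA> | <NA> | 도시평균 | 99.28 | 93 | 2216 | 31 | 10 | 76 | 20201027 | 0 | 0
41 | 11월 | 원주시 | 중앙동 | 98.47 | 30 | 709 | 40 | 13 | 93 | 20201116 | 0 | 0
42 | <NA> | <NA> | 반곡동 | 99.31 | 30 | 715 | 38 | 12 | 85 | 20201116 | 0 | 0
43 | <NA> | <NA> | 문막읍 | 99.58 | 30 | 717 | 35 | 15 | 78 | 20201116 | 0 | 0
44 | <NA> | <NA> | 도시평균 | 99.12 | 90 | 2141 | 38 | 12 | 93 | 20201116 | 0 | 0
45 | 12월 | 원주시 | 중앙동 | 98.79 | 31 | 735 | 43 | 14 | 88 | 20201211 | 0 | 0
46 | <NA> | <NA> | 반곡동 | 99.46 | 31 | 740 | 43 | 18 | 85 | 20201211 | 0 | 0
47 | <NA> | <NA> | 문막읍 | 99.06 | 31 | 737 | 39 | 18 | 73 | 20201211 | 0 | 0
48 | <NA> | <NA> | 도시평균 | 99.1 | 93 | 2212 | 42 | 14 | 88 | 20201211 | 0 | 0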
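
Row 0 of the sample holds sub-header text (최저, 최고, 최고일시, 기준초과, 초과율) under 24시간치 and the four `Unnamed: N` columns, which suggests the source file has a two-row header that was read as a single header row plus one data row. That would also explain why those columns are reported as Unsupported, since they mix text and numbers. The sketch below shows one way the two rows could be merged into single column names. `merge_header` is a hypothetical helper, and treating `Unnamed:` names as blank cells of a spanning header is an assumption about this file.

```python
top = ["시,도명", "도시명", "측정소명", "유효자료 획득율(%)", "유효 측정일수",
       "유효 측정시간", "월평균 (㎍/㎥)", "24시간치",
       "Unnamed: 8", "Unnamed: 9", "Unnamed: 10", "Unnamed: 11"]
# Row 0 of the sample; None marks cells without sub-header text.
sub = [None] * 7 + ["최저 (㎍/㎥)", "최고 (㎍/㎥)", "최고일시 (년월일시)",
                    "기준초과 (회)", "초과율 (%)"]

def merge_header(top_row, sub_row):
    """Combine a spanning top header with its sub-header row."""
    merged = []
    group = None
    for top_name, sub_name in zip(top_row, sub_row):
        # Assumption: an "Unnamed: N" cell continues the previous group.
        if not top_name.startswith("Unnamed:"):
            group = top_name
        merged.append(f"{group} {sub_name}" if sub_name else group)
    return merged

for name in merge_header(top, sub):
    print(name)
```

After renaming, row 0 would be dropped so that the five 24시간치 columns contain only measurement values.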

Duplicate rows

Most frequently occurring

Index | 시,도명 | 도시명 | 측정소명 | 유효 측정일수 | 유효 측정시간 | 월평균 (㎍/㎥) | # duplicates
0 | <NA> | <NA> | 문막읍 | 30 | 717 | 35 | 2
1 | <NA> | <NA> | 반곡동 | 31 | 740 | 43 | 2
2 | <NA> | <NA> | 반곡동 | <NA> | <NA> | <NA> | 2
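
The duplicate table lists only six columns and none of the Unsupported ones. Sample rows 2 and 46 (반곡동 in January and December) match on those six columns but differ in 24시간치 and Unnamed: 8, so at least this pair appears to be flagged only because the comparison leaves those columns out. The sketch below illustrates the effect with those two rows. `count_duplicates` is a hypothetical helper, not the profiler's own routine, and the blank 시,도명 and 도시명 cells are omitted for brevity.

```python
from collections import Counter

# Columns: 측정소명, 유효 측정일수, 유효 측정시간, 월평균 (㎍/㎥),
#          24시간치, Unnamed: 8
rows = [
    ("반곡동", 31, 740, 43, 7, 79),    # Sample row 2 (January)
    ("반곡동", 31, 740, 43, 18, 85),   # Sample row 46 (December)
]

def count_duplicates(records, key_indices):
    """Return keys (restricted to key_indices) that occur more than once."""
    key_indices = list(key_indices)
    counts = Counter(tuple(rec[i] for i in key_indices) for rec in records)
    return {key: n for key, n in counts.items() if n > 1}

supported_only = count_duplicates(rows, [0, 1, 2, 3])
all_columns = count_duplicates(rows, range(6))

print("Supported columns only:", supported_only)
print("All columns:", all_columns)
```

Forward-filling the month label or including the 24시간치 columns in the comparison would make these two rows distinct.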